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INTRODUCTION 

Video  data,  coming  from  satellites,  space  probes,  air  and  space  labs,  appears  to  be  one  of  the  most 
significant  information  channel  in  remote  sensing,  aerospace  researches,  monitoring  of  Earth  resources. 
The  registered  images  often  turn  out  to  be  distorted  due  to  non-ideal  characteristics  of  capturing  devices 
and  measuring  conditions.  The  volume  of  data  exceeds  the  transmitting  channel  bandwidth,  and  finally 
the  receiver  often  desire  to  extract  only  some  interesting  information  from  incoming  data.  Thus  the 
important  part  of  this  problem  is  image  processing,  including  image  restoration,  enhancement,  and 
compression. 

The  report  contains  the  results  of  researching  and  developing  effective  image  enhancement  and 
restoration  methods,  and  of  constructing  powerful  wavelet-based  image  compression  algorithm.  An 
approach  to  decision  of  the  problems  is  based  on  proposed  multiscale  image  model,  which  describes  an 
image  as  a  combination  of  statistically  and  semantically  independent  components,  containing  details  of 
different  sizes  and  different  information  classes.  Statistic  distinctions  of  these  components  allow  to  find 
effective  algorithms  for  decomposition  of  received  image  to  several  information  components.  Discussing 
about  image  decomposition  we  mean  the  splitting  of  an  image  onto  the  set  of  such  components.  It  offers 
the  possibility  to  take  from  an  image  only  the  component  of  interest,  to  avoid  redundunt  information  from 
the  following  analysis,  and  to  create  decomposition-based  image  processing  methods.  Basing  on  this 
approach,  algorithms  for  image  enhancement,  restoration,  and  compression  are  developed  and  considered 
in  the  project.  In  particular,  algorithms  for  image  smoothing,  local  contrasting,  automatic  amplitude 
correction,  blurred  image  restoration,  decomposition-based  image  compression. 

Speaking/  about  image  enhancement  and  image  restoration  we  always  use  some  criteria  for 
evaluating  of  image  quality.  Usually  such  evaluation  criteria  are  also  qualitative  and  informal  -  the 
estimate  of  visual  image  quality  is  closely  connected  with  the  human  perception  [1].  Typical  estimates, 
analogous  to  mean  square  deviation,  are  enough  far  from  the  visual  quality,  moreover,  they  are  applicable 
only  to  a  couple  of  some  images,  not  to  a  single  one.  Variations  may  be  more  adequate  and  helpful 
estimates.  We  would  like  to  apply  to  the  task  of  image  estimate  one  little-known  two-dimensional 
variation,  proposed  by  Kronrod  (published  in  [2]).  This  estimate  is  based  on  notation  an  image  as  some 
continual  surface  over  two-dimensional  space  of  coordinates.  In  discussing  formal  measures  we  rely  on 
the  two-dimensional  variation,  which  is  based  of  continual  image  model. 

Discussing  about  image  measures  we  will  consider  basic  concepts  of  the  continual  image  modeling, 
assuming  any  discrete  model  as  some  approach  to  the  continual  one.  On  this  base  we  wiU  discuss  metric 
characteristics  on  an  image  set,  the  question  of  image  smoothness,  and  two-dimensional  image  variation. 
Continual  model  will  be  used  for  the  definitions  and  explanations  of  it.  For  the  developing  of  image 
decomposition  and  enhancement  algorithms,  we  will  use  discrete  model.  Also  the  discrete  analog  of  two- 
dimensional  variation  will  be  used  for  experimental  estimating  of  image  smoothness. 

The  report  contains  three  following  chapters:  i)  mathematical  image  models,  ii)  image 
decomposition  and  enhancement,  and  iii)  wavelet-based  image  coding.  In  models’  domain  the  discrete 
models  for  small  local  vicinity  and  large  image  fragment  are  proposed,  which  are  later  used  for 
constructing  different  image  processing  algorithms.  Continual  image  model  is  used  for  discussing  about 
some  new  image  measure  -  two-dimensional  variation,  and  as  the  appropriate  basis  for  image  restoration 
algorithm.  Image  decomposition  is  treated  as  some  specific  image  smoothing,  and  obtained  results  are 
later  used  for  image  enhancement.  Algorithm  for  automatic  gray  scale  image  correction  is  constructed 
basing  on  of  local  image  vicinity  model.  The  task  of  restorating  images,  distorted  by  convolution-like 
operator,  is  decided  on  the  basis  on  continual  image  model.  In  final  chapter  the  task  of  image  encoding, 
using  preliminary  image  decomposition  and  wavelet  basis,  is  considered. 
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1.  MATHEMATICAL  IMAGE  MODELS 

All  the  numerous  mathematical  models  for  image  processing  can  be  divided  into  two  classes:  class 
of  continual,  and  class  of  discrete  models  [3].  Continual  models  are  much  more  suitable  for  the  analytical 
research,  they  help  to  perform  theoretical  investigations  and  estimates,  but  actually  they  describe  some 
ideal  objects,  which  we  cannot  operate  with.  Discrete  models  much  more  correspond  to  real  digital 
images,  because  in  a  computer  we  may  contain  and  process  only  discrete  signal.  Such  models  give  us 
direct  way  to  develop  image  processing  algorithms,  but  bring  difficulties  in  theoretical  analysis.  So  nor 
continual  nor  discrete  model  is  unable  to  solve  all  the  problems,  and  we  will  try  to  apply  both  models  for 
our  task. 


1.1.  DISCRETE  IMAGE  MODEL 

1.1.1.  Basic  approaches 

Image  decomposition  may  be  interpreted  as  the  task  of  splitting  an  image  to  the  smoothed  and  the 
difference  components.  The  smoothed  component  should  contain  lengthy  image  areas  with  approximately 
constant  intensity,  separated  by  sharp  edges.  Nonlinear  integral  operators  [8,  9]  and  algorithms,  based  on 
anisotropic  diffusion  models  [10]  are  used  for  the  decision  of  the  problem.  In  our  work  we  will  take  into 
account  the  nonlinear  algorithm. 

Typical  observed  scene  consists  of  a  set  of  objects  of  different  sizes  [1 1],  so  the  image  of  the  scene 
will  include  corresponding  set  of  areas  (domains)  of  smoothly  changed  brightness,  separated  by  sharp 
contour  edges.  Each  area  contains  fine  details  and  texture,  and  often  is  distorted  by  some  noise.  We 
assume  that  captured  image  e  X  may  be  represented  as  an  additive  mixture  of  several  statistically 
independent  components,  each  next  one  is  characterized  by  reduced  scale  parameters.: 

X  =  S  +/■  -l-M  + 
mn  mn  nm  mn  * 

First  component  s„,„  determines  smoothed  values  of  extended  domains  and  describes  extended  image 
details.  The  other's  terms  reflect  information  about  fine  details,  texture,  noise,  etc.  In  a  simply  case  one 
may  limit  the  row  of  components  by  two  [9]: 

^mn  ^iiin  ^mii '  ( 1  •  1 ) 

Here  becomes  to  be  the  sum  of  all  other  information  components. 


Let  an  image  X  consists  of  R  different  domains  C/^,  and  inside  a  domain  V  (Xy  e  U') 

smoothed  component  S  with  appropriate  accuracy  is  represented  by  polynomial  of  power  m.  Then  one 
may  write  the  formula: 


C"'  = 

‘-’ij 

p=0  ^=0 


For  the  value  of  individual  element  we  obtain: 


P=0  q=0 


(xr  €  uy 


(1.2) 


This  is  the  main  formula  of  image  model.  Analysis  of  real  images  demonstrates,  that  inside  unique  region 
we  may  assume  to  be  normally  distributed  N(0,a^),  but  with  different  variation  <7,  because 
statistical  properties  of  texture  component  may  differ  from  a  domain  to  domain. 


Basic  formula  (1.2)  may  be  modified  with  different  G?.  Depending  upon  the  class  of  processed 
images,  tasks  to  be  decided,  required  accuracy,  and  computing  complexity  one  can  choose  appropriate 
polynomial  power  value  G).  Our  experiments  demonstrate,  that  for  the  majority  of  real  airspace  images 
and  typical  window  sizes  (10-30  pixels),  the  objects  inside  the  window  are  usually  specified  by  slowly 
changing  or  approximately  constant  brightness.  Because  of  this  reason,  and  due  to  the  necessity  to 
develop  enough  effective  algorithms  for  image  enhancement,  it  is  demanded  to  restrict  the  power  of 
polynomial  as  G)  =  0. 
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Almost  all  image  models  describe  statistical  correlation  between  neighbor  elements  only  inside  a 
small  vicinity  (3-5  pixels).  However  statistical  dependencies  between  pixels  in  unique  domain  lie  much 
far  -  up  to  the  distances  of  15-20  elements  and  larger  [11].  This  leads  us  to  use  different  models  for  the 
small  vicinity  and  for  the  extended  image  fragment. 


1.1.2.  Model  of  local  fragment 

Let  be  a  fragment,  surrounding  central  point  (m,n),  and  covering  J  domains  U U''. 
Correspondingly  to  (1.2),  and  under  a  =  0,  the  elements  of  fragment  inside  domain  V  are  described  by 
piecewise  constant  model: 


+  (XyeU^) 


(1.3) 


where  is  an  average  brightness  of  part  of  domain  t/'”,  falling  into  the  fragment  This  model  is 
simple,  enough  close  to  the  majority  of  real  images,  and  acceptable  for  the  decision  of  the  following  task 
of  image  decomposition. 


1.1.3.  Model  of  local  vicinity 

Let  us  consider  a  vicinity  of  some  shape  around  image  pixel  containing  R  elements 
xL  ^  Km  (f  =  1,...,R),  and  p'  denotes  the  distance  between  x,„„  and  Draw  a  plane  minimizing 
mean  square  error,  which  will  form  a  two-sided  angle  of  value  Thus  to  each  point  with  its  vicinity 
we  may  assign  a  vector  g^„  with  amplitude  g„„,  =  th  and  rotation  angle  Let  elements  of  vicinity 

differ  from  this  plane  by  random  value  In  such  an  event  one  can  write  following  formula  for  the 
model  of  vicinity,  which  correspond  to  (1.2)  under  (0-\: 

+  0-4) 

where  p„,„  is  the  value  of  the  plane  at  central  point  (m,n),  and  -  the  projection  of  the  vector  g„„  to 
the  vector,  drawn  from  center  of  vicinity  to  point  r  . 


The  task  to  draw  such  a  plane  is  not  difficult  and  similar  question  was  considered  in  [3].  Let  a  and 
P  are  tangents  of  plane  slopes  in  vertical  and  horizontal  directions,  /  and  j  -  relative  coordinates  of  the 
pixel  in  vicinity.  In  such  a  case  the  equation  of  the  plane  will  be  Zy  =  p  +  ai  +  Pj ,  and  formula  (1.4) 
will  be  written  as  X,j  =  p  +  ai  +  Pj  +  y .  Mean  square  error  will  be  given  as 

f{a,p,y)  =  +  Pi  +  p  -  Xyf . 

i  J 

Minimum  of  f{a,P,y)  locates  at  a  point,  where  all  partial  derivatives  equal  to  zero,  so  one  can 
find  values  of  a ,  P ,  and  pAn  practice  the  vicinity  of  3x3  pixels  is  most  frequently  used.  For  such  a 
case: 


1  1 

«  =  g(Z^i.y  P  = 

Knowing  a  and  P  we  can  simply  calculate  parameters  of  vector  g : 
g  =  V  =  arctg(^) . 


(1.5) 


(1.6) 


1.1.4.  Local  average  and  local  median 

In  accordance  to  (1),  the  decision  of  image  decomposition  task  consists  of  defining  the  value  of 
smoothed  component  for  each  image  pixel.  We  will  find  decomposition  algorithm  in  the  class,  based 
on  analysis  of  image  brightness  distribution  for  local  fragment,  rounding  current  processed  elements. 

The  subtask  of  image  decomposition  -  finding  smoothed  image  component  -  is  similar,  but  is  not 
covered  by  local  image  smoothing  task,  performed  or  with  the  helps  of  local  average 

2  n+L 

y mn  /O  r  J 

(ZL+l;  i=m-Lj=n-L 


(1.7) 
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or  with  the  helps  of  local 


median 


ynm  -  {^«m  }  ’  where  med  {W}  is  the  root  of  the  equation: 


med(W) 


(1.8) 


Here  H,^„  =  {h'^ik)}  is  the  fragment  histogram,  i.e.  the  fraction  of  pixels  of  value  k  in  the  fragment 
W  . 

mn 


1.2.  CONTINUAL  IMAGE  MODEL 

1.2.1.  Definition  of  an  image  set 

According  to  the  continual  model,  any  monochromatic,  flat,  and  stationary  image  is  represented  by  a 
bounded  real-valued  function  of  two  variables  f{x,y)‘,  {x,y)  &  D.  A.  real-validity  and  a  boundedness 
are  the  only  restriction  to  / (Xjy)  placed  in  image  modeling.  It  is  obvious  that  a  set  of  such  functions 
turns  out  to  be  too  wide.  In  particular,  it  contains  various  “bad”  functions  to  which  many  mathematical 
operations  are  not  applied.  To  avoid  such  kind  of  problems  it  is  supposed,  that  there  exist  all  the 
properties  of  function,  we  need  for  one  or  another  mathematical  method.  So  we  think  that  the  function 
f  (Xjy)  is  as  good,  as  it  is  necessary.  The  result  of  this  assumption  turns  out  to  be  very  unexpected, 
enough  to  mention  the  difficulties  we  have  under  using  the  Fourier  transform  for  image  processing 
problems. 

Here  we  will  try  to  formalize  the  properties  of  images,  to  underline  which  mathematical  object  we 
are  dealing  with,  under  considering  an  image  a  signal  that  can  be  percepted  by  human  visual  system.  We 
will  regard  the  simplest  image  type  (flat,  stationary,  and  monochromatic)  as  the  electromagnetic  radiation 
distribution  with  the  density  f{x,y) ,  which  is  defined  on  the  bounded  domain  D  e  OXY . 

Earlier  it  was  mentioned  that  function  of  two  real  variables  / (x,y)  is  a  nonnegative,  bounded,  and 
real-valued,  since  it  is  proportional  to  squared  amplitude  of  electromagnetic  wave.  The  second  obvious 
image  property  is  the  integrability  of  f{x,y)  on  DsOXY.  Really,  every  physical  quantity  is  a 
function  of  a  domain,  not  a  point.  Like  an  energy,  for  example.  On  the  other  hand,  a  density  is  a  function 
of  a  point,  since  a  density  is  introduced  as  a  derivative.  So  by  definition  f(x,y)  is  a  integrable  function 
on  D,  and  because  of  its  boundedness  on  closed  bounded  domain,  integrable  function  f{x,y)  is 
continuous  on  D  everywhere  except  a  null  set. 

Now  we  will  obtain  the  third  property  of  an  image,  rose  from  the  physical  model  we  use.  To  stay 
within  the  borders  of  geometrical  optics,  we  have  to  consider  only  plane  electromagnetic  wave  inside 
every  small  domain  of  a  space.  This  means  that  the  amplitude  of  the  wave  is  a  constant  or  low-changing 
function  at  the  distances  compared  with  the  wave  length  X  or  the  resolution  of  a  recorder  [7].  This  means 
that  we  can  consider  the  light  amplitude  and  the  image  f{x,y)  to  be  piecewise  constant  or  piecewise 
monotone  function. 

Thus,  it  is  shown  that  an  image  / {x,y) ;  (x,^)  €  D  comply  with  following  conditions: 

1 .  0  <  f{x,y)<C,  where  C  is  a  constant; 

2.  f  (x,y)  is  a  continuous  on  D ,  everywhere  except  a  null  set; 

Z.  f{x,y)',  (x,y)  €  £>  is  a  piecewise-monotone  function. 

So  a  set  of  images  is  a  set  of  real-valued  functions  of  two  variables  f{x,y)-,  (x,y)  €  D  comply 
with  the  conditions  1-3  above.  These  properties  can  be  used  to  solve  many  image  processing  problems. 
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Furthermore,  it  is  easy  to  proof  that  the  conditions  1-3  are  sufficient  for  f{x,y)  to  be  a  function  of 
bounded  variations. 

In  one-dimensional  case,  if  we  hold  fixed  one  of  the  variables  (X  or  y),  then  f  (x),  x  €  \a,b\  is  a 
function  of  bounded  variation.  Variation  is  defined  by: 

F(/)=  sup  E|/(x4)-/(Xi_i)|,  (1.9) 

«  k=2 

where  a<x^,...,<x„<b  is  an  arbitraiy  system  of  points  on  [  <3,  .  According  to  the  Dirichlet  condition, 

a  function  /(x),  x  €  [a,b\  is  a  function  of  bounded  variation  on  every  final  interval,  on  which  it  is 
bounded  and  has  final  number  of  maximum,  minimum,  and  discontinuity  points  of  the  first  type. 

According  to  conditions  1-3  it  is  easy  to  prove  that  the  function  /(x)  appears  to  be  a  function  of 
bounded  variations  also  for  two-dimensional  case.  It  is  correct  for  the  most  common  definitions  of  many¬ 
dimensional  variations.  Note,  that  there  are  many  determinations  of  variation  of  a  function  of  several 
variables.  The  majority  of  them  rise  to  the  determination  of  one  functional,  which  boundedness 
guarantees  the  existence  of  certain  characteristics  of  the  function.  In  any  case  the  list  of  these 
characteristics  is  too  poor  in  comparison  with  one-dimensional  case.  It  holds  true  for  the  Arzela,  Vitali, 
Tonelly,  and  other  known  variations  [2].  Essentially  different  approach  was  proposed  by  Kronrod, 
published  in  [2],  where  it  was  shown  that  a  function  of  two  variables  must  be  characterized  not  by  one, 
but  by  two  independent  functionals.  Those  functionals  were  determined  as  following: 

00  00 

\^{f)=  w^{f)=  \v^{e,)dt  (l.io) 

^  —00  —00 

where  a  set  is  a  f -level  of  the  function  f{x,^  •,  is  a  number  of  components  of  the  set 

V^{e^  is  a  length  of  the  set  6^  [2].  For  a  continuous  differentiable  function: 

=  \\\grad{f{x,y))\dxdy. 

D 

It  is  known,  that  numerous  attempts  to  create  image  model  basing  on  determination  of  only  one 
functional  cannot  give  us  a  complete  description  of  image  characteristics.  Moreover,  first  component  Wj 
of  two-dimensional  variation  in  contrast  to  other  metrical  characteristics  reflects  topological  properties  of 
an  image. 

The  class  of  functions  of  bounded  variation  is  very  extensive.  In  spite  of  this,  the  functions  of  such 
class  possess  many  good  properties.  They  have  a  total  differential  almost  everywhere,  their  Fourier  series 
converge  to  them  almost  everywhere,  etc.  [2].  In  the  physical  applications,  the  condition  of  the 
boundedness  of  variations  on  every  bounded  domain  means  that  f{x,  is  bounded,  and  high  frequency 
Fourier  components  cannot  affect  to  the  function  intensity.  It  is  interesting  to  note  that  mathematical 
description  of  wave  diffraction  leads  to  the  intensity  distributions,  which  can  be  used  as  an  illustration  of 
the  functions  of  unbounded  variations  [7]. 

1.2.2  Metric  characteristics  on  an  image  set 

In  the  previous  section  we  defined  some  limitations  to  the  images.  To  operate  with  images  as  with 
mathematical  objects,  we  have  to  introduce  also  the  quantitative  characteristics  of  images.  At  the 
beginning  we  have  to  formulate  what  is  wanted  to  measure,  and  how  to  perform  it.  The  first  is  how  to 
compare  images,  i.e.  how  to  metrize  the  image  set.  According  to  the  aim  of  investigation,  it  is  possible  to 
construct  different  definitions  of  a  distance  between  two  images.  However,  an  image  in  our  interpretation 
is  an  information  for  visual  perception.  So  in  ideal  case,  a  distance  between  two  images  must  be  measured 
correspondingly  to  some  metric,  being  adequate  to  a  human  vision. 

In  spite  of  intensive  study  of  human  visual  system,  any  satisfactory  metric  coordinated  with 
subjective  estimation  it  is  not  suggested.  Moreover,  the  existing  analogs  of  visual  metrics  [1]  are  too 
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complicated,  or  there  are  not  metrics  in  the  mathematical  sense.  Because  of  that,  known  visual  metrics 
can  not  be  used  in  optimization  methods.  Thus,  the  integral  metric  is  most  often  used: 

/>(/,g)  =  [J|/W-g(z)l'’*l'".  (1.11) 

D 

It  is  most  simple  to  use  (1 .1 1)  under  p  =  2,  because  only  in  this  case  we  can  construct  linear  optimization 
algorithms. 


In  image  processing,  especially  in  decision  of  ill-posed  problems,  the  norms  of  functions  are  used  as 
the  Tikhonov  stabilization  functional  [4,  5,  6].  Different  norms  of  functions  can  be  used  as  a  measure  of 
image  smoothness.  As  it  follows  from  the  definition,  a  norm  measures  the  distance  to  zero,  that  is  a 
distance  to  the  most  smoothing  function.  The  quadratic  forms  are  usually  used  to  avoid  the  nonlinearity. 

For  example,  the  set  of  images  is  supposed  to  be  the  Sobolev  space  Wj,  (Z))  with  respect  to  the  norm: 


)  +^2(^) 


In  discrete  approximation,  the  usage  of  a  Gaussian  image  model  leads  to  the  minimization  of  the 
following  quadratic  forms: 

‘  (1.13) 


where  C  is  a  covariance  operator  of  f{x,y),  is  a  mean  value.  The  norm  (1.13)  corresponds  to  a 
Sobolev  norm  (1.12)  under  certain  conditions  [5].  It  is  necessary  to  note  that  in  some  cases  the  usage  of 
quadratic  forms  in  the  extreme  problems  can  lead  to  the  undesirable  results. 


Besides  norms,  all  measures  of  smoothness,  which  are  defined  on  the  functional  set,  can  be  used  for 
the  decision  of  decomposition  problem.  The  most  popular  is  the  probability  measure  p{z},  which  is 
introduced  on  a  set  of  discrete  images  [1].  Considering  an  image  two-dimensional  geometric  surface 
z  =  f{x,y),  one  can  use  any  of  the  geometric  measures  or  other  metric  characteristics,  for  example, 
entropy  and  variations.  In  particular,  the  Hausdorff  measures  were  efficiently  adapted  to  the  fractal 
theory.  It  is  known  that  the  variations  of  a  set  give  more  adequate  description  of  its  geometrical  structure 
in  comparison  with  measures  and  entropy  [2].  For  the  estimation  of  the  geometric  “complexity”  of  an 
image  the  variations  should  be  used.  Correspondingly  to  real  images  the  complexity  reflects 
“unsmoothness”  of  an  image.  So,  increasing  image’s  complexity  we  reduce  its  smoothness,  and  vice 
versa.  Very  often  we  use  such  a  measure  as  a  power  of  a  finite  set,  for  example  when  an  image’s 
histogram  is  estimated.  A  truth  value  is  applied  as  a  measure  in  some  special  cases,  when  we  must  say 
“yes”  or  “  no”  using  video  data. 


The  main  question  is  the  following:  which  metric  characteristic  of  the  whole  variety  existing  in 
modem  mathematics  to  choose.  In  other  words,  there  is  a  need  to  estimate  the  quality  of  metric 
characteristics,  which  is  defined  as  the  usefulness  of  the  measures  for  a  given  purpose  and  a  given  user. 
Unfortunately,  except  some  general  reasoning  we  can  propose  only  the  numerical  experiments  to  answer 
this  question.  In  the  next  section  we  describe  the  scheme  that  can  be  used  in  numerical  experiments  to 
compare  the  effectiveness  of  the  usage  of  different  metric  characteristics  . 
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2.  IMAGE  DECOMPOSITION  AND  ENHANCEMENT 

2.1.  IMAGE  DECOMPOSITION 

The  image  smoothing  operation  may  be  treated  as  some  transformation,  which  lefts  on  an  image  only 
lengthy  details,  and  avoid  small  details,  noise,  texture,  etc.  To  understand  this  process  better,  we  interpret 
image  as  a  combination  of  statistically  and  semantically  independent  components,  each  of  them 
containing  details  of  different  information  classes.  Statistic  distinctions  of  them  allow  to  find  an 
algorithm  for  extracting  from  an  image  only  the  component  of  interest.  Such  splitting  an  image  to  the 
components  we  call  image  decomposition. 

We  will  find  rank  algorithm  to  decompose  an  image  to  two  components:  smoothed  one,  contained 
lengthy  details  with  sharp  edges,  and  texture  one,  contained  fine  details,  texture,  and  noise.  These 
components  may  be  used  both  separately  for  the  consequent  analysis  (for  example:  image  smoothing, 
object  extraction,  texture  analysis,  etc.),  and  together  for  the  image  enhancement. 

According  to  (1.1),  the  decision  of  image  decomposition  task  consists  of  finding  the  value  of 
smoothed  component  for  each  image  pixel.  We  will  find  decomposition  algorithm  in  the  class,  based 
on  analysis  of  image  brightness  distribution  for  local  fragment,  rounding  current  processed  elements. 
According  to  piecewise  image  model  [12],  the  probability  density  of  brightness  P{x}  for  the  fragment 
containing  R  domains  (/',... ,C/*,  has  R  local  maxima  (see  Fig.l).  Their  positions  coincide  with 
the  values  of  mean  brightness’  of  each  domain:  We  assume,  that  each  image  pixel  belongs  to 

some  domain  r  .  Also  is  the  mean  value  for  the  part  of  domain  r ,  containing  this  pixel,  and  placed 

inside  the  fragment  So,  in  a  distribution  P{x}  we  have  to  find  the  peak,  corresponding  to  the 

domain  rounding  central  pixel,  and  then  determine  its  local  maximum.  The  problem  is  that  exact  shape  of 
distribution  P{x}  is  unknown,  and  we  only  can  arrange  it  by  the  sample  probability  -  the  local  histogram 

H^„  =  {h{k)} ,  reflecting  the  fraction  of  pixels  of  value  k ,  been  contained  inside  the  fragment  . 


Fig.l.  An  image  local  fragment  containing  three  domains  ((/',  U^,  U\  and  its  histogram  h{k). 


We  will  find  out  image  decomposition  algorithm  among  rank  ones,  based  on  order  statistics  of  the 
fragment.  Most  known  and  simple  of  them  is  local  median  (1.8),  but  performed  analysis  [12] 
demonstrates,  that  both  local  median  and  local  average  (1.7)  give  insufficient  decision  of  the  above  task. 
Much  better  are  algorithms,  based  on  M-estimates  and  using  order  statistics  [13].  This  type  of  estimates  is 
more  adaptive  to  signal  variability,  and  consists  of  the  following.  Let  some  weighted  function  w,  (x), 
(/=!,...,«)  is  defined  for  the  ordered  set  of  elements  {x}  =  x,  <  Xj  <,...,<  x„  (x,  €  ).  M-estimate 

for  this  set  will  be: 
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M(fF)  =  w,.  (X;  )/iw,  (x,. ) . 

I=l  f  /=1 

If  for  any  i  we  choose  w,.(x)  =  1  if  -^  <  x  <  ^ ,  and  w^(x)  =  0  for  other  x ,  then  we  obtain  sigma 
filter  Lee  [14].  The  value  of  central  pixel  of  the  fragment  x„,„  is  chosen  as  a  zero-point: 

n  j  n 

yn,n  =  (2-1) 

/=1  /  1=1 

This  filter  was  proposed  for  image  smoothing,  and  it  was  recommended  to  choose  5  =  2(7,  where 

<7^  is  noise  variance.  In  the  survey  [8]  the  sigma  filter  was  nominated  as  one  of  the  best  edge-preserving 
noise  cleaning  filter  for  fragments  of  5-7  elements'  width.  The  weak  place  of  sigma  filter  is  the  choice  of 
zero  point  for  weighted  function  w,  (x) .  The  value  of  central  pixel  of  the  fragment,  as  it  was  proposed  in 
[14],  is  simple  but  too  inaccurate  estimate  for  noisy  images.  It  is  enough  probable,  that  the  difference 
jc,  -x„,„  is  so  big,  that  desirable  peak  is  out  of  the  range  -5  <x„,„  <5.  So  our  estimate  proved  to  be 
incorrect,  will  lose  the  accuracy,  and  filtered  image  will  be  blurred  (this  effect  has  more  high  influence 
near  contour  edges). 


10 


Fig.2.  Comparison  of  image  smoothing  algorithms.  Source  airphoto  image  (a);  its  local  average  (b),  local  median 
(c),  and  smoothed  component  after  the  image  decomposition  (d). 

Basing  on  the  model  (1.3)  and  on  the  representation  of  the  fragment  as  a  set  of  domains,  we  propose 
iterative  algorithm,  containing  sigma  filter  as  a  subclass.  The  idea  consists  of  iterative  analysis  of 
embedded  fragments  of  different  sizes.  The  smaller  is  the  fragment,  the  less  is  the  number  of  domains 
being  covered,  and  hence  the  less  is  the  influence  of  redundunt  domains.  Of  course,  the  accuracy  of  the 
estimate  will  also  be  reduces  correspondingly  to  smaller  window  size.  Value,  obtained  for  the  smaller 
fragment,  is  used  as  a  null-point  for  larger  one.  Thereby  following  iterative  algorithm  may  be  formulated. 

On  each  iteration  q ,  estimate  is  calculated  over  corresponding  fragment  .  The  value  y’”' , 
obtained  on  the  previous  iteration  over  the  fragment  is  used  as  current  zero-point  of  weighted 

function.  For  the  first  iteration  the  value  of  central  pixel  is  used.  The  value,  obtained  on  the  last  step 
Q,  is  assumed  to  be  the  resultant  value  of  smoothed  component  at  a  point  (/«,«) : 

;  yl„  =  T.xwix  -  )/'Zw(x~  yll ) .  (2.2) 

Experiments  demonstrate,  that  two  iterations  is  usually  enough:  first  one  for  the  vicinity  of  3-5  pixel 
width,  and  second  one  for  the  fragment  of  1 5-25  pixel  width.  This  algorithm  has  enough  high  accuracy  in 
estimating  of  smoothed  component  on  noisy  images:  processing  of  artificially  composed  signals  (a  set  of 
flat  objects  with  5%  gaussian  noise)  demonstrates  mean  square  error  near  0.3-0.5%.  Visual  estimates  of 
images  after  the  decomposition  also  confirm  the  efficiency  of  the  algorithm.  Qualitative  estimate  of 
images'  decomposition  with  the  helps  of  two-dimensional  variation  is  discussed  in  the  following  chapter. 

The  example  of  smoothing  some  airphoto  image  with  the  helps  of  local  average  (1.7),  local  median 
(1.8),  and  proposed  algorithm  of  image  decomposition  (2.2),  is  represented  in  Fig.2.  One  can  see 
significant  differences  of  contour  sharpness  in  pictures  (b),(c),  and  (d).  These  results  demonstrate,  that 
image  decomposition  algorithm  (2.2)  can  successfully  extract  smoothed  component  from  the  image. 


Fig.3.  Example  of  image  decomposition:  a  -  source  image  (Mars  surface);  b  -  smoothed  component 
after  the  decomposition. 


2.2.  IMAGE  DECOMPOSITION  AND  TWO-DIMENSIONAL  VARIATION 


In  all  image  smoothing  algorithms  the  shape  of  operator  and  the  power  of  smoothness  is  defined  by 
some  parameters.  Most  often  the  parameters  are  chosen  empirically  with  visual  estimate  of  a  result,  but  in 
many  cases  they  may  be  found  as  a  solution  of  some  optimization  problem,  so  we  need  to  measure  the 
properties  of  images  we  are  interested  in.  In  our  case  such  estimate  is  the  smoothness  of  an  image.  Above 
we  discussed  formal  possibilities  of  application  of  two-dimensional  variation  (1.10)  for  image  estimation. 
Nevertheless  one  question  remains  unknown:  does  this  estimate  can  reflect  the  complexity  and 
smoothness  of  images  better  than  other  metrical  characteristics,  based  on  only  one  functional?  To  answer 
the  question  the  set  of  experiments  in  smoothing  some  test  images  was  performed. 

For  the  numerical  estimate  and  comparison  of  proposed  image  decomposition  with  traditional  image 
smoothing,  based  on  local  average  and  local  median,  both  artificial  and  real  images  were  used.  These 
images  are  presented  on  Fig.3.  Four  images  were  used  for  the  testing:  two  real  ones  (Landscape  and 
Portrait),  and  two  artificially  generated  in  following  manner.  A  priory  smooth  artificial  image  (piecewise 
constant  signal  -  a  set  of  domains  of  different  sizes  and  brightness)  was  dusted  initially  by  2%  gaussian 
noise  (Test  Image  1).  Second  image  was  dusted  by  a  mixture  of  2%  gaussian  and  2%  uniformly 
distributed  pepper-and-salt  noise  to  make  a  noise  with  “heavy  tails”  (Test  Image  2).  These  images  were 
smoothed  by  image  decomposition  algorithm  (8)  with  different  outer  windows  and  thresholding 
parameters.  Well  known  moving  average  and  moving  median  algorithms  were  used  for  the  comparison. 
Two-dimensional  variations  (1.10),  mean  square  deviations  from  the  original  signal,  entropy,  and 
differential  entropy  were  measured.  These  data  are  reduced  into  the  Table  1  and  are  drawn  as  a  set  of 
plots  in  Fig.3  and  4. 

For  the  beginning  it  is  interesting  to  compare  image  decomposition  with  local  average  and  local 
median.  Mean  square  error  on  Fig.3. b  and  3.e  shows,  that  image  decomposition  restores  original  image 
enough  precisely,  whereas  local  average  and  median  significantly  distort  an  image  (Fig.3.b).  The 
smoothness  of  images  after  the  transformations  may  be  reflected  by  difference  entropy  (Fig.3  .c);  one  can 
see,  that  decomposition  produces  more  smoothed  image.  Along  with  that,  the  first  component  of  two- 
dimensional  variation  (Fig.3. a)  reflects,  that  both  average  and  mean  produces  images  without  details 
(corresponding  plots  are  tending  to  zero),  whereas  decomposition  plot  tends  to  some  non-zero  constant. 
Plots  on  Fig.3.d  and  3.f  demonstrate,  that  choosing  of  thresholding  parameters  helps  to  make  different 
smoothed  and  detailed  images. 
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Flg.3.  Test  images:  a  -  source  image;  b  -  test  image  2  (cprrupted  by  gaussian  and  uniformly  distributed 
noises);  c  and  d  -  smoothed  component  after  the  decomposition  with  different  area  thresholding. 

Comparing  the  estimates  of  the  first  and  the  second  components  of  two-dimensional  variation  one 
can  see,  that  their  behavior  both  on  test  and  on  real  images  looks  similar.  With  increasing  the  size  of  outer 
smoothing  window  first  component  significantly  reduces  to  some  constant,  while  second  component 
remains  approximately  the  same  (see  Fig.4).  The  fact,  that  first  component  approximates  to  some 
constant  (the  plot  has  some  “plateau”)  may  help  to  choose  the  parameters  of  decomposition  algorithm.  It 
looks  reasonable  to  recommend  to  take  the  parameters,  corresponding  the  beginning  of  such  plateau. 

Perhaps  it  is  early  to  draw  any  conclusions  about  global  advantage  of  introduced  two-dimensional 
variation,  but  in  any  case  it  looks  perspective  and  demanding  the  continuation  of  researches. 

Table  1.  Values  of  two-dimensional  variation:  IV/  is  the  first,  and  IV2  is  the  second  component. 


Smoothed  component  (WI)  Texture  component  (Wl) 


Image 


Test  Image  1 


Test  Image  2 


Landscape 


Portrait 


133.6  1966 


252.9  3347 


Source  image 

Dusted  image 

WI 

W2 

Wl 

W2 

MSE 

6.74 

695.2 

463.7 

2873 

5 

6.74 

695.2 

795.3 

4203 

13.6 

25 

5 

9 

15 

25 

8.7 

467 

467 

465 

466 

16.1 

721 

728 

738 

747 

12.3 

158 

161 

164 

166 

18.1 

365 

366 

365 

362 

In  the  Tables  2  and  3  the  following  notation  is  used:  Fi  and  Fo  are  sizes  of  inner  and  outer 
fragments;  Ti  and  To  are  thresholding  limits  for  inner  and  outer  fragments;  C  is  type  of  component:  s  - 
smoothed,  t  -  texture  (difference);  Wj  and  W2  are  the  first  and  the  second  components  of  two- 
dimensional  variations;  Mean  and  Var  are  mean  and  variance;  Entr  and  DifEntr  -  entropy  and  difference 
entropy. 
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Ti 


Source  image 


Median 


Average 


1 

3 

1 

7 

1 

11 

1 

15 

1 

20 

1 

■1 

1 

15 

1 

25 

Median 

Average 

1 

3 

1 

■a 

1 

11 

1 

15 

1 

20 

1 

7 

'  1 

15 

1 

25 

Image2  (Ai 

Ti 

To 

W] 

W2 

Mean 

Var 

Entr 

DifEntr 

133.56 

1966.33 

93.17 

33.38 

6.913 

4.749 

9.47 

551.02 

91.89 

30.49 

6.611 

2.930 

9.73 

577.15 

93.17 

30.86 

6.684 

3.057 

35.56 

940.44 

92.92 

32.05 

6.742 

3.308 

27.27 

852.49 

92.69 

31.72 

6.724 

3.241 

21.30 

759.17 

92.43 

31.33 

6.697 

3.157 

16.96 

679.40” 

92.21 

30.99 

6.672 

3.071 

13.02 

599.22 

91.99 

30.63 

6.646 

2.975 

29.91 

818.01 

92.73 

31.56 

6.704 

3.091 

21.33 

710.15 

92.46 

31.12 

6.675 

2.985 

14.96 

598.68 

92.17 

30.61 

6.636 

2.855 

167.69 

1832.59 

129.29 

11.05 

4.779 

4.641 

162.47 

1918.86 

128.00 

11.04 

5.015 

4.712 

157.65 

1429.38 

128.26 

6.03 

4.477 

4.372 

160.24 

1497.04 

128.48 

6.84 

4.534 

4.429 

162.13 

1584.20 

128.74 

7.92 

4.602 

4.491 

164.31 

1669.18 

128.96 

8.96 

4.669 

4.546 

165.55 

1767.51 

129.96 

10.20 

4.759 

4.608 

147.43 

1451.95 

128.44 

6.94 

4.647 

4.391 

150.04 

1540.09 

128.71 

8.10 

4.716 

4.456 

151.33 

1641.45 

129.00 

9.49 

4.799 

4.524 

9 

9 

3 

9 

3 

9 

3 

9 

3 

9 

3 

9 

3 

9 

3 

9 

3 

9 

9 

9 

3 

9 

3 

9 

3 

9 

3 

9 

3 

9 

3 

9 

3 

9 

3 

9 

Median 


Average 


9 


15 


1  25 


5 


9 


15 


25 


Median 


Average 


Wi 


252.94 


10.25 


9.49 


24.38 


22.67 


20.43 


18.09 


21.46 


20.45 


18.46 


16.70 


348.43 


320.99 


15 

t 

365.85 

25 

t 

362.02 

5 

t 

386.36 

9 

t 

386.09 

15 

t 

381.78 

25 

t 

373.29 

W2 


1145.58 


1080.68 


1658.67 


1639.41 


1579.23 


1432.29 


1590.45 


1581.02 


1537.60 


1408.51 


3000.87 


3170.16 


2416.84 


2434.32 


2487.20 


2639.31 


2557.07 


Mean 


73.70 


73.22 


73.70 


73.59 


73.57 


73.52 


73.42 


73.45 


73.45 


73.41 


73.34 


128.48 


128.00 


128.11 


128.13 


128.18 


128.28 


128.25 


2588.33 


2706.31 


128.29 


128.36 


Var 

Entr 

DifEntr 

42.91 

6.422 

5.041 

40.24 

6.283 

3.175 

38.43 

7.085 

3.445 

41.83 

7.172 

3.647 

41.79 

7.171 

3.634 

41.67 

7.167 

3.604- 

41.34 

7.155 

3.513 

41.71 

7.168 

3.610 

41.69 

7.168 

3.602 

41.60 

7.164 

3.579 

41.30 

7.154 

3.497 

12.06 

5.087 

4.921 

13.71 

5.490 

5.003 

6.09 

4.605 

4.676 

6.21 

4.619 

4.686 

6.67 

4.667 

4.714 

8.22 

4.808 

4.792 

6.65 

4.676 

4.751 

6.72 

4.681 

4.754 

7.03 

4.713 

4.767 

8.41 

4.835 

4.824 

7 


9 


11 


13  15  17  19  21  23  25 


5  7  9  11  13  15  17  19  21  23  25 


a.  First  component  of  variation  d.  First  component  of  variation  (various  thresholdings) 

(various  smoothings) 
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b.  Mean  square  error  (various  smoothings)  ^  Mean  square  error  (various  thresholdings) 


c.  Entropy  and  differential  entropy  f  Entropy  and  differential  entropy 

(various  smoothings)  (various  thresholdings) 


Fig.3.  Dependencies  of  various  estimates  on  the  size  of  smoothing  window  (on  X-axis):  a  and  d  -  the  first 
component  of  two-dimensional  variations  (Wl),  b  and  e  -  the  mean  square  error  (MSE),  c  and  f-  the  entropy  (Entr) 
and  the  differential  entropy  (DifEntr);  a,  b,  and  c  -  various  smoothing  windows  (Test  image  1):  decomposition 
(Dec),  median  (Med)  and  average  (Aver);  d,  e,  and  f  -  image  decomposition  under  various  thresholding  (1  or  4)  on 
outer  window  (Test  image  2). 
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a.  First  variation  b.  Second  variation 

(Test  Image  1,  texture  component)  (Test  Image  1,  texture  component) 
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c.  First  variation  on  real  images  d  Second  variation  on  real  images 

(smoothed  component)  (smoothed  component) 


Fig.4.  The  behavior  of  two-dimensional  variation  on  texture  component  and  on  real  images  after  the  decomposition; 
plots  a  and  b  shows  the  first  and  the  second  component  of  Test  Image  1;  plots  c  and  d  shows  the  first  and  the  second 
component  of  real  images  in  comparison  with  Test  Image  1 . 

2.3.  IMAGE  ENHANCEMENT  ON  THE  BASIS  OF  IMAGE  DECOMPOSITION 

Image  enhancement  is  intended  for  rising  of  visual  image  quality  before  the  following  analysis. 
Rozenfeld  [15]  interpret  image  quality  as  spatial  resolution  and  sharpness.  Avoiding  problem  of 
increasing  of  resolution,  and  not  taking  into  account  independent  problem  of  color  reproduction,  one  may 
say,  that  visual  image  quality  is  dictated  by  the  contrast  and  the  sharpness  of  its  details.  Traditionally  the 
problem  of  image  enhancement  may  be  subdivided  into  two  parts:  the  gray  scale  correction,  globally 
affecting  to  the  entire  image,  and  the  image  sharpening,  being  attained  by  local  contrasting  [1].  In  the 
present  paper  we  consider  the  second  one,  and  treat  it  as  magnification  of  local  contrasts. 

There  are  known  many  methods  for  image  enhancement.  Some  reviews  may  be  found  in  [1,8,9]. 
Also  there  is  demonstrated,  that  better  results  are  given  by  methods,  adaptable  to  statistical  parameters  for 
local  window.  Most  of  them  use  local  mean  and  variance,  or  similar  estimates,  like  median  and 
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interquartile  distance  [16].  The  overwhelming  majority  of  used  methods  may  be  generalized  by  some 
common  filtering  formula,  closed  to  well-known  unsharped  masking.  Detailed  discussion  will  be  done 
below,  here  we  will  only  notify,  that  most  important  element  of  these  methods  is  obtaining  of  well- 
smoothed  image. 

Image  enhancement  is  based  on  human  perception  and  visual  estimation  of  an  image,  and  is  closely 
connected  with  extracting  of  video  information  from  an  image  for  the  following  analysis  and 
interpretation.  Because  of  the  absence  of  formal  image  quality  criterion,  there  is  no  global  image 
enhancement  theory,  and  it  is  usually  treated  as  a  raising  of  visibility  of  image  details.  This  is  attained  by 
magnification  of  contrasts  (both  global  and  local)  on  an  image.  We  consider  here  only  the  task  of 
magnification  of  local  contrasts. 

Many  methods  for  image  enhancement  are  published.  The  comparison  of  used  methods  [1,8,12] 
demonstrates,  that  better  results  are  given  by  methods,  adaptable  to  statistical  parameters  for  local 
window.  Methods,  analogous  to  well-known  unsharp  masking  are  most  frequently  used.  In  a  simple  form 
they  may  be  described  by  the  formula: 

where  is  or  local  average  or  local  median;  K  is  coefficient  of  magnification  of  local  contrasts,  b  and 
c  are  parameters  of  transformation  (usually  constants).  By  choosing  K  and  6  we  may  significantly 
change  the  properties  of  the  filter.  For  example,  if  is  local  average,  K  =  0,  and  6  =  1,  we  obtain  low- 
pass  filter  (image  smoothing);  if  1,  and  6  =  0,  we  obtain  high-pass  filter;  if  AT  >  1,  and  b  =  \,-  local 
contrasts'  magnification. 

In  a  widely  referenced  paper  [17],  Wallis  proposed  a  statistical  operator  for  generalizes  unsharp 
masking  and  statistical  differencing  [15]  by  forcing  a  desired  mean  and  standard  deviation: 
acTj  _ 

y'x”  ^  ^ (2-3) 

Here  and  <t„,„  are  local  mean  and  variance  of  the  source  image;  and  <7^  are  desired  global  mean 
and  variance  of  processed  image;  a  and  are  local  contrast  and  brightness  forcing  factors.  Analogous 
filtering  algorithm  was  proposed  in  [16],  where  local  median  and  interquartile  distance  were  used  instead 
of  local  mean  and  variance. 

Some  analysis  and  classification  of  different  enhancement  algorithms  were  done  in  [12],  where  they 
were  reduced  to  closed  form,  generalized  in  the  following  manner: 

ym«  =  /(^»,»  -  s„,„  J  (2-4) 

Parameters,  being  presented  in  the  formula,  are  related  to  following  image  properties:  -  estimate  of 

image  brightness  inside  local  window,  rounding  processed  element  (local  mean,  median,  or  similar);  <7„,„ 
-  estimate  of  image  variability  inside  the  window;  f{u,v)  -  some  monotone  function  of  U,  depending  on 
parameter  v . 

Formula  (2.4)  describes  enough  wide  class  of  transformations,  definable  by  the  choice  of  estimate 
and  function  /(m,v)  .  Of  course,  estimate  may  be  not  only  local  average  or  median.  Looking  to 

the  formula  it  is  evident,  that  is  very  similar  in  sense  to  in  (1.3).  Such  approach  to  image 
enhancement  means  nothing  more  than  separating  of  source  signal  to  two  quite  different  components,  and 
following  transformation  over  them.  So,  we  may  substitute  smoothed  j’,„„  and  detailed  components, 
obtained  after  decomposition,  into  (2.4)  instead  of  and 

Image  enhancement  basing  on  the  decomposition  (2.2)  mainly  demonstrates  good  experimental 
results.  Nevertheless  false  contours  sometimes  rise  on  an  image  near  not  very  sharp  edges.  The 
explanation  of  this  artifact  was  done  in  [9]  -  the  effect  appears  due  the  inaccuracy  of  piecewise  constant 
model  (1.3)  for  the  non-sharp  contour  edges.  Decomposition  algorithm  smoothes  an  image  so,  that  it 
becomes  to  be  even  more  sharp  on  contour  edges,  then  source  one.  Hence,  on  such  places  the  difference 
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^mn  ~  ^mn  8®^  oppositc  sign  in  comparison  with  x^„  —  (element  value  minus  local  average  or 
local  median),  and  magnification  of  this  difference  induces  the  appearance  of  false  contours.  To  avoid 
this  lack  it  is  enough  to  perform  small  linear  averaging  of  smoothed  component  Usually  averaging 
inside  the  vicinity  3x3  is  enough.  Experiments  on  such  enhancing  some  airphoto  images  basing  on  image 
decomposition  demonstrate  high  quality  of  obtained  results. 


Fig.4.  Enhancement  of  Landsat  image  (a)  using  local  average  (b),  and  smoothed  component  after  image 
decomposition  (c). 


An  example  pf  enhancing  some  space  image  is  demonstrated  on  Fig.4.  One  can  see,  that  after  image 
enhancement  with  local  average  (image  Fig.4,b)  small  details  and  contours  of  expanded  areas  are 
overcontrasted,  whereas  after  preliminary  image  decomposition  (Fig.4,c)  these  lacks  are  avoided,  and 
objects'  brightness  remain  unchanged. 


2.4.  AUTOMATIC  GRAYSCALE  IMAGE  CORRECTION 

Automatic  gray  scale  correction  of  captured  video  data  (both  still  and  moving  images)  is  one  of  the 
least  researched  questions  in  the  image  processing  area,  in  spite  of  this  question  is  touched  almost  in 
every  book,  concerned  with  image  processing.  Classically  it  is  related  to  the  image  enhancement 
[1,18,19],  and  frequently  is  classified  as  histogram  modification  techniques.  Traditionally  used 
algorithms,  based  on  analysis  of  image  histogram  [20-23],  are  not  able  to  decide  the  problem  properly. 
The  investigating  difficulties  are  associated  with  the  absence  of  formal  quantitative  estimate  of  image 
quality  -  till  now  the  most  often  used  criteria  remain  the  human  visual  perception  and  experience.  As  a 
hence,  is  still  unsolved  the  problem  of  finding  out  some  measurable  properties  of  real  images,  which 
might  be  the  basis  for  automatic  building  of  gray  scale  correction  function  (sometimes  been  identified 
also  as  gamma-correction  function). 

Below  we  will  try  to  discern  some  common  properties  of  real  images,  that  could  help  us  to  evaluate 
the  gray  scale  image  distortion,  and  finally  to  construct  the  appropriate  correction  function  to  enhance  an 
image.  Such  a  method  might  be  sufficiently  used  for  automatic  image  processing  procedures,  like 
enhancing  of  medical  images,  reproducing  of  pictures  in  publishing  industry,  correcting  of  remote 
sensing  images,  preprocessing  of  captured  data  in  computer  vision  area,  and  for  many  other's 
applications.  The  question  of  complexity  of  analysis  procedure  becomes  to  be  important  when  an 
algorithm  is  realized  in  real-time  (for  example  in  video  input  devices,  like  video  cameras). 


24.1.  Statement  of  a  task 
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The  goal  is  to  propose  some  methods  for  automatic  image  enhancement  by  use  of  gray  scale 
correction.  We  guess,  that  the  only  accessible  initial  information  is  already  distorted  captured  image 

e  X ,  and  formal  representation  of  such  gray  scale  image  distortion  is  the  following; 

^mn  ~  f^ymn^-  (2-5) 

Here  ym„  and  are  pixels'  values  correspondingly  of  non-distorted  original  and  of  registered  images, 
and  is  the  distortion  function.  Assuming  that  f  (z)  is  monotone  and  univalent,  there  exists  an 

inverse  function  F(x),  which  may  be  used  to  reconstruct  source  image.  Its  formal  representation  is  also 
extremely  simple: 

y/nn  ~  (2-6) 

where  F{x)  is  gray  scale  correction  function,  destined  to  compensate  the  characteristics  of  video 
capture  and  storage  devices,  and  to  enhance  the  received  video  data.  Nevertheless  the  simplicity  of  the 
formula  not  reflects  the  simplicity  of  the  task. 

The  central  problem  here  is  the  question  of  how  to  find  out  the  function  .  In  most  cases  there 
is  no  any  usable  information  other  then  already  distorted  video  data,  and  the  direct  image  distortion 
function  usually  also  remains  unknown.  Thus,  the  only  way  to  find  out  the  function  F{x)  becomes  the 
analysis  of  the  source  image.  So,  the  main  question  is  the  criterion  of  estimating  the  image  gray  scale 
distortion. 

Almost  all  known  methods  for  gray  scale  image  correction  use  only  a  distribution  of  pixel  values 
(image  histogram),  and  not  take  into  account  statistical  correlation  between  neighbor  pixels,  though  it  is 
obvious,  that  gray  scale  distortions  will  change  them  too.  Below  we  will  demonstrate  the  possibility  to 
construct  gra^  scale  correction  algorithm  on  a  basis  of  these  statistical  properties.  For  more  deeply 
discussion  of  this  question  we  must  base  on  some  image  model,  been  adequate  for  our  task. 

2.4.2.  Discussing  image  model 

Real  images  are  characterized  by  presence  of  a  set  of  different  size  areas  (details  of  a  scene)  with 
approximately  constant  or  weakly  varying  brightness  inside,  been  separated  by  contour  borders.  Very 
often  an  image  is  treated  as  some  two-dimensional  surface,  been  traced  (with  predetermined  accuracy) 
across  the  values  of  image  pixels  in  a  coordinate  plane  (m,n)  [1,24,25].  Lengthy  image  details  are 
represented  here  as  smoothed  and  approximately  horizontal  areas,  and  contours  -  as  narrow  strips  with 
steep  slope  angle  of  the  surface. 

A  suitable  image  model  was  proposed  and  researched  above.  Correspondingly  to  the  model  (1.4),  for 
every  image  pixel  there  may  be  selected  some  closed  vicinity  Vmn  (possibly  of  changeable  shape) 

containing  the  set  of  neighbor  pixels  x'^^j  ^Knn'>  ^  belonging  to  the  same  image  object  or 

contour,  as  x^^  pixel.  An  appropriate  plane  may  be  drawn  across  the  elements  of  the  vicinity,  which  will 
be  sloped  relatively  to  the  horizontal  plane  for  the  two-sided  angle  ■  Thus  to  each  of  the  image 
points  there  may  be  assigned  some  vector  g;„„  with  its  amplitude  g^n  ~  ^S¥mn  and  twist  angle  . 
Correspondingly  to  formula  (1.4),  is  the  value  of  drawn  plane  in  the  central  point  of  vicinity 
(m,w),  in  other  words  the  average  value;  p  is  Euclidean  distance  between  the  central  point  and  the 

point  at  position  V ;  and  Ymn  is  some  random  value,  been  defined  by  the  granularity  of  registering 
materials  and  by  the  noise  of  input  device.  The  value  gmn  defines  the  local  contrast  inside  the  vicinity. 

Calculate  the  dependence  of  the  value  g  on  the  signal  brightness  by  averaging  it  for  a  set  of 

vicinities  ,  having  the  same  average  values  =  k ,  and  let  X(vi^ )  is  the  number  of  such  vicinities 
on  an  image: 
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c{k)  = 


1 


ILSn 


(2.7) 


This  function  c(it)  we  shall  call  the  Local  Contrast  Function.  In  [24],  where  it  was  demonstrated,  that 
local  contrast  function  c(A)  for  the  whole  image  may  be  well  approximated  as  a  sum  of  independent 
distributions  for  two  different  sets  of  image  pixels;  contour  element  pixels,  and  internal  flat  areas'  pixels. 
Each  of  them  may  be  well  represented  by  appropriate  normal  distribution,  where  the  variance  for  contour 
pixels  is  5-20  times  more  than  the  variance  for  flat  areas. 


2.4.3.  Hypothesis  about  local  contrast  function 

Substitute  the  formula  (1.4)  to  (2.5).  Average  value  ftmn  will  change  to  and  component 

P'^  Smn  to  if'iPmn  )  +  ^^f'iPmn  )))p'^  Smn ,  where  f’{x)  is  the  derivative  of  f{x).  The  component 
Ymn  will  not  change,  because  we  suppose  that  it  was  added  after  the  gray  scale  distortion.  Assuming  that 


o^f\Pmn  ))  is  negligible  small,  we  obtain: 

^mn  ~  f  ^Pmn  )  ^  ^Pmn  )P  Smn  Y mn-  (2-8) 

For  us  it  is  important,  that  the  tangent  of  slope  angle  of  the  plane,  been  drawn  through  the  elements  of 
vicinity,  will  change  to  the  following  value: 

g  =  f'iPQ)gQ.  (2.9) 

Substituting  (2.9)  to  (2.7)  for  the  image  with  initial  local  contrast  function  Cq  (^)  we  obtain: 

c(/(^))  =  /'(^)co(^).  (2.10) 

Since  we  suggest  that  for  / (k)  there  is  an  inverse  function  F(k),  we  may  write  the  inverse  equation: 

c'^{F{k))  =  F\k)c{k).  (2.11) 


From  this  equation  it  is  evident,  that  for  determining  the  gray  scale  correction  function  F(k)  we  need  to 
estimate  the  present  local  contrasting  function  c{k)  of  distorted  image,  moreover  we  must  to  make  a 
conclusion  about  the  shape  of  so-called  ideal  local  contrasting  function  Cq  (  ^  ) . 


What  will  happen  with  local  contrast  function  c^^k)  if  we  will  take  into  account  only  some  subset 
of  image  pixels?  According  to  the  image  model  (1.4),  statistical  properties  of  the  value  g^n  will 
significantly  change  depending  on  which  set  of  image  pixels  the  vicinity  belongs  to:  to  the  set  of 
pixels  from  flat  regions,  or  to  the  set  of  contour  pixels.  Correspondingly  to  that,  the  shape  of  function 
Cq  (^)  will  also  depend  upon  the  chosen  set  of  pixels:  internal  or  contour  ones. 

The  set  of  internal  pixels  of  flat  regions  possesses  the  next  two  deficiencies:  i)  statistical  properties 
of  textures  of  different  regions  may  significantly  vary  from  one  another,  hence  and  c{^k)  will  reflect 
more  these  properties,  but  not  actual  gray  scale  distortion;  ii)  due  to  comparatively  small  variation  of 

texture  component  there  is  enough  high  probability  that  the  distribution  of  for  internal  pixels  may 

contain  some  gaps,  so  c(k)  will  be  insignificant  for  these  values  of  argument. 


The  set  of  contour  pixels  looks  preferable  for  using.  Values  of  g„n  for  contour  vicinities  depend  on 


the  module  of  brightness  differences  ^s'  —s-^  |  for  closed  regions  and  on  the  width  of  contour  overfall.  Of 

course,  absolute  values  of  g„,„  also  depend  on  the  sharpness  of  the  image  and  on  its  discretisation,  but 
these  factors  affect  uniformly  and  linearly  for  all  image  pixels,  and  thus  will  be  compensated.  Under  the 
condition  that  contour  set  of  pixels  is  enough  representative,  we  may  suggest:  i)  the  probability 

?{ju=k}  {k  for  the  set  of  pixels,  belonging  to  the  contour  between  any  two  regions  f/'  and 

w  ,  is  distributed  uniformly  in  the  interval  ii)  th®  sharpness  of  contours  not  depends  on  the 
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values  s'  and  s-',  and  may  be  assumed  to  be  identical  on  the  range  of  brightness;  iii)  the  set  of  intervals 
^5' ,  j  covers  the  whole  range  of  image  brightness  [^min » ^max]  • 

From  these  suggestions  we  may  conclude,  that  for  non-distorted  high  quality  image  the  values  of 
local  contrast  function  c(k),  been  measured  for  the  contour  set  of  pixels,  will  be  uniform  for  the  whole 

range  of  image  brightness  [^min  -  ^  -  ^max]-  Thus  we  may  accept  the  hypotheses  of  uniform 
distribution  of  local  contrast  function  for  contour  image  pixel:  c(k)  =  Cq. 

Some  experiments,  been  performed  over  the  set  of  high  visual  quality  images,  demonstrate,  that  local 
contrast  function  c(k),  been  measured  for  the  set  of  contour  pixels,  is  enough  closed  to  uniform.  On  the 
other  hand  the  same  function,  been  measured  for  all  image  points,  hardly  correlate  with  image  histogram 
-  the  peaks  of  histogram  correspond  to  the  depressions  of  c(k).  It  is  induced  by  influence  of  image 
texture,  contained  inside  flat  regions. 


2.4.4.  Local  contrasts  alignment  algorithm 

Relying  on  the  hypotheses  done  above,  that  c(k)  =  Q,  and  on  equation  (2.11),  we  may  write: 
F'(k)  =  Cq  /  c{k).  Calculating  the  sum  from  0  to  ^  we  find: 


/=o<^(0 

Introducing  the  normalization  for  the  maximum  brightness:  F(Ar-l)  =  .^r-l,we  obtain: 

K-l  1 

Cq  =  {k-\)i  y - 

6)C(A:)  ■ 

Now  we  may  write  down  final  equation  for  calculating  the  gray  scale  correction  function  F{k)  for 
alignment  the  local  image  contrasts: 

=  (2,12) 

,=oc(0  k=0<^(k) 

The  proposed  algorithm  was  tested  on  several  standard  images.  The  results  demonstrate,  that  visual 
quality  of  images,  processed  by  the  algorithm  (2.12),  is  preferable  then  quality  of  the  images,  processed 
by  other  gray  scale  correction  algorithms. 


One  serious  deficiency  of  this  algorithm  consists  of  necessity  of  preliminary  detecting  of  image 
contours.  Really,  the  gray  scale  distortion  will  also  affect  to  the  results  of  contours'  extraction,  hence  the 
decision  of  the  task  may  not  be  correct.  To  overcome  this  problem  we  need  or  to  develop  some 
appropriate  adaptive  contour  detection  methods,  or  to  find  out  some  modification  of  the  algorithm,  which 
not  demand  similar  operations. 

2.4.5.  Modification  off  the  algorithm 

If  we  can't  separate  source  image  to  contour  and  flat  components  enough  precisely,  let's  try  to  use 
union  set  of  pixels  without  any  preliminary  splitting.  In  such  a  case  we  need  to  compensate  the 
deficiencies,  introduced  by  using  complex  statistics  of  the  whole  image.  As  it  was  already  discussed 
above,  if  in  (2.12)  we  take  into  account  the  whole  set  of  image  pixels,  then  the  local  contrast  functions 
c(k)  have  significant  depressions  for  the  argument  values  k  ,  been  corresponded  to  the  brightness  of 
lengthy  image  areas.  These  depressions  inverse  correlate  to  peaks  of  image  histogram,  and  in  such  a  case 
iV(v^)  in  (4)  may  be  revealed  as  the  image  histogram  H(k),  e.g.,  a  number  of  image  pixels  of 
brightness  k  . 


There  may  be  different  decisions  of  how  to  compensate  this  influence,  but  one  well  known  and 
enough  simple  variant  is  the,  introduction  of  some  compensating  component  into  the  formula  (4)  by 
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adding  relatively  small  value  S(N)  to  H(k)  in  the  denominator  {N  is  the  total  number  of  image 
pixels): 


Cif(k)  = 


1 

H{k)  +  S(N) 


(2.13) 


Such  an  addition  of  reduces  c(k)  for  small  H{k),  and  thus  compensate  the  depreciation  of 

c(  A: )  for  large  H{  k ) .  The  obtained  Cif(k)  must  be  substituted  to  (2. 12),  as  before. 


2.4.6.  Clipped  histogram  equalization 

In  some  applications  we  may  not  use  the  local  contrast  alignment  algorithm.  There  may  be  several 
reasons  for  that,  one  a  situation  may  happen  because  of  enough  specific  source  data.  For  example,  if 
captured  image  contains  brightly  illuminated  object  and  hardly  shadowed  background  without  any  details 
of  intermediate  brightness,  there  may  appear  well-expressed  internal  gaps  in  probability  distribution  of 
image  brightness.  Frequently  the  algorithm  (2.13)  is  unable  to  remove  such  gaps.  In  other  case  the 
proposed  algorithm  may  not  satisfy  us  due  to  its  relative  complexity,  for  example  if  we  need  automatic 
gray  scale  correction,  been  realized  inside  video  camera  in  a  real  time.  For  such  a  case  the  algorithm 
(2.13)  looks  to  be  too  complicated.  We  must  simplify  the  analysis  procedure  of  the  source  image 
according  to  the  limitations  of  weight,  size,  and  price  of  such  devices. 

Thinking  about  this  task  in  the  context  of  image  enhancement  one  may  recognize,  that  more  often  an 
observer  wants  to  compensate  the  deficiency  of  objects  lighting,  instead  of  to  correct  possible  non¬ 
linearity  of  registering  device.  Moreover,  when  an  image  is  used  for  a  visual  analysis,  the  observer  also 
would  like  to  get  it  with  the  best  (usually  with  maximum  possible)  visual  contrast  of  fine  details,  and 
frequently  he/she  would  be  ready  to  forgive  some  mistakes  in  differences  of  brightness  between  lengthy 
objects,  if  there  are  well  distinguished.  This  facility  may  be  used  for  working  out  comparatively  simple 
but  enough  efficient  algorithm.  The  goal  of  the  algorithm  is  to  magnify  contrast  in  representative  sections 
of  gray  scale  range  at  the  cost  of  compressing  non-used  ones,  provided  that  the  magnification  may  not 
exceed  some  boundary  value. 

We  may  formalize  the  task  in  the  following  manner.  The  probability  distribution  of  image  pixels' 
values  is  represented  by  it's  (normalized)  histogram  P{x  =  ^}  =  H{k)l  N ,  where  H{k)  is  the 
image  histogram,  and  N  is  the  total  number  of  image  pixels.  Accordingly  to  the  task,  the  whole  dynamic 
range  of  image  brightness  [0,  may  be  separated  to  a  set  of  intervals  [A^i  where  H(k)  or  has 

appreciable  non-zero  value  (i.e.,  greater  than  some  threshold  of  significance),  or  equals  to  zero  (i.e.,  is 
less  then  the  threshold).  Permitting  the  existence  of  zero-length  intervals  we  may  accept,  that  H{k) 
considered  to  be  zero  in  every  even  interval  [^2/»^2/+i]>  and  to  be  non-zero  in  every  odd  one 
[^2i+l  ’^2/+2  ]•  Now  it's  obvious,  that  to  solve  the  problem  we  may  simply  to  discard  all  even  intervals, 
then  to  join  odd  ones,  and  finally  linearly  stretch  the  derived  union  to  the  whole  region  of  values  [0,  K^. 

The  formulated  task  may  be  decided  with  the  help  of  different  algorithms,  and  we  would  like  to 
propose  enough  robust  one.  There  is  well-known  algorithm  -  histogram  equalization  [23]  -  which  has 
such  ability  to  discard  unused  region  of  brightness.  Its  deficiency  consists  of  hard  nonuniformity  of 
contrast  transformation:  the  magnification  for  the  brightness  k  is  proportional  to  the  probability  of  pixels 
of  that  value  (in  other  words  is  proportional  to  the  histogram  value  H{  A^ )).  To  be  applied  to  our  task  this 
algorithm  must  be  modified  in  the  following  manner. 


Let  there  are  given  two  thresholds:  Tq  as  the  threshold  of  significance  and  7j  as  the  threshold  of 
saturation  {Tq<T{).  If  P{A}>7J),  we  will  consider  that  the  brightness  k  is  significant  and 
corresponding  interval  will  be  added  to  odd  ones.  Otherwise,  if  P{A}  <  Tq*  h  will  be  added  to  zero-level 
(even)  intervals.  Additionally,  if  P{A}  >  Tj,  H{k)  will  be  clipped  down  to  the  level  Tq.  Thus  we  may 
write  the  next  formula  of  transformation  (clipping)  the  image  histogram  : 
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't,N,  if  H(k)<T,N-, 

H^ik)^\H(k),  if  T,N<H{k)<T,N-,  (2.14) 

if  H{k)>T,N. 

Now  it  is  possible  to  perform  the  standard  histogram  equalization  algorithm  [23]  for  the  obtained  function 
Hc(k).  The  final  equation  for  calculating  clipped  histogram  equalization  function  F{k)  will  be  the 
following: 

F(A:)  =  (i^:-l)i^,(A:)/SW).  (2.15) 

/=0  k=0 

The  advantage  of  this  formula  is  that  it  demands  very  simple  image  analysis  algorithm  -  the  only 
calculating  of  the  image  histogram  is  needed,  thus  it  may  be  successfully  used  in  real-time  applications. 

Such  a  transformation  was  experimentally  realized  inside  one  digital  video  camera  to  enhance  an 
image  in  a  real  time.  The  Histogram/Hough  Transform  Processor  chipset  L64250  (designated  and 
manufactured  by  LSI  Logic  Corporation)  was  used  to  count  the  image  histogram  H{k)  during  video 
frame.  These  data  were  transferred  to  a  host  processor  to  perform  clipping  and  equalization 
correspondingly  to  formulae  (2.14)  and  (2.15).  The  resulting  transfer  function  F{k)  was  downloaded 
back  to  the  L64250  (as  a  look-up-table)  to  process  video  signal  in  a  real  time.  The  maximum  value  of 
delay  between  video  data  analysis  and  making  the  transformation  active  not  exceeds  two  video  frames. 

Clipped  histogram  equalization  was  well  tested  both  for  static  images  and  for  real  video.  The  visual 

quality  of  output  images  was  acknowledged  as  well  acceptable. 

/ 

2.4.7.  Results  and  discussion 

Described  above  local  contrast  alignment  algorithm  has  been  sampled  by  transforming  different 
images.  There  were  tested  three  categories  of  images;  i)  originally  distorted  low-quality  images,  ii) 
standard  high-quality  images  been  artificially  degraded  by  some  gray-scale  distortion,  and  iii)  standard 
high-quality  images  in  their  initial  view.  All  the  images  were  enhanced  by  algorithm  (2.13),  and  two 
estimates  were  defined  for  each  the  image:  its  general  visual  quality,  and  its  relative  quality  in 
comparison  with  the  source  image  (better  or  worse  than  the  quality  of  original  image). 

The  global  estimate  of  applicability  of  the  algorithm  regards  it  as  well  sufficient  for  most  number  of 
images.  Originally  distorted  images  (i)  were  enhanced  both  by  general  and  by  relative  estimates. 
Degraded  images  (ii)  also  where  enhanced  relatively  to  the  source  (degraded)  copy,  but  in  many  cases 
their  quality  remains  worse  than  the  quality  of  initial  (standard,  not  degraded)  images.  The  standard 
images  (iii)  after  the  transformation  make  up  three  approximately  equal  groups:  the  group  of  images  with 
almost  identical  gray  scale  transformation,  the  group  of  definitely  changed  images  but  with  preserved 
visual  quality,  and  the  third  one  -  the  images  of  deteriorated  quality.  The  explanation  of  this  deterioration 
may  be  the  following.  High-quality  images  typically  occupy  the  whole  range  of  brightness,  and  initially 
they  already  possess  high  contrasts  between  lengthy  objects.  The  algorithm  (2.13)  often  tries  to  expand 
contrast  of  fine  details  in  shadowed  or  bright  areas,  where  original  brightness  may  be  saturated.  However 
by  expanding  the  contrast  in  some  ranges  of  gray  scale  it  will  reduce  the  contrasts  in  other’s  ones.  Thus 
the  algorithm  may  induce  the  reduction  of  contrast  between  lengthy  areas,  that  often  is  perceived  as 
deterioration  of  general  image  quality. 

We  discussed  above  only  the  gray  scale  images,  but  proposed  algorithms  also  may  be  applied  for 
enhancing  true-color  images.  To  do  that  a  color  image  must  be  represented  in  the  form  of  brightness,  hue, 
and  saturation  (if  it  is  presented  in  usual  form  of  red,  green,  and  blue  components,  it  must  be 
correspondingly  converted).  Then  the  transformation  (2.13)  has  to  be  invoked  for  the  brightness 
components  only,  and  the  other's  two  should  be  remained  as  is.  Then  the  image  must  be  inverted  back  to 
the  useful  form.  The  performed  experiments  demonstrate  the  efficiency  of  this  algorithm  for  enhancing 
visual  quality  of  true-color  images. 
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Local  contrast  alignment  algorithm  demonstrates  its  ability  to  enhance  different  kind  of  images,  and 
it  is  preferable  in  comparison  with  other  gray  scale  correction  algorithms.  In  spite  of  it  is  unable  to 
enhance  any  kind  of  source  images,  it  may  be  recommended  for  most  number  of  low-quality  images.  This 
algorithm  may  successfully  supplement  the  local  image  contrasting  algorithm  (2.4). 

The  clipped  histogram  equalization  algorithm  (2.14)  may  be  recommended  for  the  applications, 
requiring  real-time  image  processing.  Such  a  transformation  was  successfully  realized  and  tested  in 
experimental  video  camera,  and  estimating  of  output  image  quality  demonstrates  the  efficiency  of  the 
algorithm. 


2.5.  IMAGE  RESTORATION 

2.5.1.  General  principles  of  solving  the  restoration  problem 

The  image  restoration  problem  is  usually  formulated  in  the  following  way:  to  find  (to  estimate)  an 
ideal  non-degraded  image  z  from  the  equation 

Az+rix,^  =  iix,^  +  rix,)^  =  'Q.  (2.16) 

Here  ueU,  zeZ,  ^  is  a  linear  imaging  operator,  «(x,y)  is  the  noise,  and  u(x,y)  is  an  output  degraded 
image. 

The  most  universal  principles  of  solving  that  problem  are  formulated  in  the  statistical  estimation 
theory  and  in  the  theory  of  solving  ill-posed  problems  [4].  Beside  the  general  rules  there  exist  numerous 
methods  of  restoration  based  on  the  usage  of  the  specific  features  of  the  problem  (a  simple  type  of  the 
distorting  operator,  existence  of  a  known  background,  possibility  of  obtaining  a  large  number  of  images 
of  the  same  object,  etc.).  But  regardless  of  what  approach  we  use,  the  restoration  problem  can  not  be 
solved  with  the  help  of  the  so  called  empirical  methods  that  are  effectively  applied  to  solve  other 
problems  of  image  processing  (filtration,  segmentation  etc.).  Restoration  problem  is  a  typical  inverse 
problem  of  mathematical  physics  and  as  any  other  inverse  problem  can  be  solved  only  on  the  basis  of 
exact  mathematical  methods. 


2.5.2.  General  restoration  formulas 

What  ever  method  we  use  to  obtain  the  restored  image,  it  must  comply  (in  a  certain  way)  with  the 
main  equation  (2.16),  i.e.  it  must  provide  the  closeness  of  the  left  and  the  right  sides  in  this  equation.  So 
the  most  general  formulation  of  the  restoration  problem  can  de  reduced  to  the  functional  minimization: 

z\g,ri)  =  M  Pu{Az,u)  (2.17) 

Z€Z 

where  is  a  certain  metric  in  U.  To  guarantee  the  uniqueness  and  the  stability  of  the  solution,  it  is 
necessary  to  formulate  a  priory  information  about  an  ideal  image  as  a  functional  Q(z)  that  possesses 
stabilizing  properties.  In  this  case 
|z*  =  inf  p^{Az,u), 

Q(z)  <  C. 

Under  certain  conditions  the  problem  (2.18)  can  be  reduced  to  the  unconditional  extreme  problem,  in 
particular  to  Tikhonov  minimization  [4]: 

z*=  ]rd  ■  Pjj{A,tf)  +  (£)iz)\,  (2.19) 

ze  ^  ^ 

where  a  is  the  parameter  of  regularization.  Usually  it  is  assumed  that  the  ideal  image  is  a  smooth  function 
with  respect  to  Sobolev  space. 


(2.18) 


If  U  is  defined  as  Euclidean  space  with  respect  to  the  norm  (u,Bu),  where  5  is  a  positive  definite 
operator  we  obtain: 

z*  =  inf  Haz  -  u\\jg  +  aD(z)  . 
z  e  Z' 


(2.20) 
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It  should  be  noted  that  the  statistical  methods  used  in  image  restoration  result  in  extreme  problems  similar 
to  (2.20).  So  using  Bayes  strategy  or  MAP  test  we  obtain  the  optimal  estimation  in  the  form 

z*=  inf  [-\nq{Az~u)-\np{z)],  (2.21) 

zeZ 

where  p(z)  and  q(x)  are  a  priori  probability  densities  of  the  ideal  image  and  the  additive  noise 
n(x,y)={Az-u).  Using  in  (2.21)  the  most  popular  probabilistic  models  of  the  image  and  the  noise 
specifically  when  q(x)  is  Gaussian  and  p{z)  is  Gibbs  distribution,  we  obtain: 


lUz-w||J  +  t/(z)], 


z*=  inf 
Z€Z' 

where  U  is  Gibbs  potential  and  is  covariance  operator  of  the  noise  [5]. 


(2.22) 


The  main  essential  difference  between  the  regularization  method  of  image  restoration  (2.20)  and  the 
statistical  method  (2.22)  is  the  existence  of  the  regularization  parameter  in  (2.20).  It  is  necessary  to  point 
out  that  the  opportunity  of  obtaining  a  family  of  solutions  that  depend  on  a  parameter  is  very  important. 
This  allows  us  to  control  the  visual  quality  of  image  restoration  interactively  regardless  of  a  mathematical 
criterion  of  visual  image  quality. 

Usually  the  linear  algorithms  are  used  in  image  restoration.  These  algorithms  can  be  represented  in 
the  following  form 

z*=  inf  {II^z-mId +  a(z,Cz)|,  (2.23) 

z  €  Z'  ’ 

here  B  and  C  are  the  positive  definite  operators. 

We  stress  that  the  statistical  methods,  in  particular  Wiener  filtration,  lead  to  the  minimization  of  the 
same  quadratic  functional  (2.23)  assuming  that  the  image  and  the  noise  are  Gaussian,  and  a=L  In  this 

case  B  and  C  are  inverse  covariance  operators  of  the  noise  and  the  image:  5=[Cov(«)]'l,  C=[Cov(z)]'^. 
From  (2.23)  it  follows  that  linear  algorithms  differ  from  each  other  by  choice  of  operators  B  and  C.  It  can 
be  shown,  that  as  a  rule,  the  result  of  linear  restoration  does  not  depend  much  on  the  operators  B  and  C, 
i.e.,  the  result  does  not  depend  much  on  the  covariance  characteristics  of  z  and  n.  In  any  case  the  quality 
of  the  linear  restoration  always  depends  on  the  characteristics  of  degraded  operator  A  and  the  magnitude 
of  the  noise  ||«||  =s. 


We  propose  to  obtain  an  essentially  new  result  using  the  non-linear  method  of  restoration,  when  a 
priori  information  about  the  image  is  given  not  in  the  square  functional  form 


z*=  inf 
z  €  Z 


Az  -  u\g  +  aFar(z)|. 


(2.24) 


Using  the  definition  (1.10)  it  is  possible  to  obtain  the  non-linear  algorithm  of  image  restoration: 
z*  =  inf||i4z-  uf 

(2.25) 

W2  ^  Cl, 

or  under  certain  conditions: 

z*  =  \nfJ^Az-uf  +Qr,>v,  +a2W2}.  (2.26) 

So  the  decomposition  of  an  image  into  two  independent  components  can  be  effectively  used  in 
image  restoration. 
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3.  IMAGE  COMPRESSION  BASED  ON  THE  WAVELET  TRANSFORMATION 
WITH  NONLINEAR  DECOMPOSITION 

3.1.  Statistical  and  Visual  redundancy. 

The  classical  scheme  of  image  compression  (and  being  also  applicable  to  image  transmission  and 
archiving)  is  shown  on  Fig.5.  The  source  can  be  treated  as  an  application  or  a  memory  where  images  are 
fetched  from.  Compressor  reduces  redundant  information  and  sends  it  through  the  channel.  Decompressor 
restores  transmitted  signals  and  transforms  them  into  the  form  suitable  for  a  viewer.  The  models  of  a 
source  and  a  receiver  determine  the  meaning  of  “redundancy”.  The  first  one  describes  so  called  statistical 
redundancy  (S),  and  the  second  one  -  visual  redundancy  (V). 


Fig.  5.  Classical  scheme  of  information  transmission  system. 

3.1.1.  S-redundancy.  Source  models. 

The  natural  images  are  highly  structured  having  components,  which  look  like  smoothed  or  flat  areas 
surrounding  by  contours,  textures  and  noise.  Intuitively  each  of  them  separately  can  be  described  with 
reasonable  precision  by  rather  simple  models.  The  images  are  complex  compositions  of  those 
components.  The  task  of  a  source  model  is  to  decompose  the  complex  image  on  the  independent 
components  based  on  a  priory  information  about  images.  The  key  to  effective  decomposition  is  that  each 
one  of  the  signal  components  may  be  represented  efficiently  by  the  model.  So,  the  sufficient  model 
should  have  corresponding  number  of  appropriate  components  to  represent  as  many  of  them  as  needed  for 
an  image  receiver. 

The  degree  of  statistical  independence  achieved  may  be  defined  in  terms  of  entropy.  For  that  we 
have  to  compare  the  joint  entropy  among  all  the  components  with  the  sum  of  individual  entropies  of  each 
the  component  (cm  is  m-th  component): 

(3.1) 

m 

If  there  exist  statistical  dependencies  among  the  Cm,  then  the  joint  entropy  will  be  less  than  the  sum 
of  individual  entropies,  otherwise  the  two  quantities  will  be  equal.  Assume  that  there  is  some  way  to 
verify  that  the  total  information  in  the  image  (joint  entropy)  is  preserved,  then  we  can  reduce  statistical 
dependencies  by  lowering  the  individual  entropies  as  much  as  possible.  Statistical  dependence  can 

be  even  more  complicated  if  between  components  Affine  or  other  dependencies  exist.  Special  case  of 
such  images,  having  high  order  S-redundancy,  is  self  similar  images,  or  their  complex  compositions, 
called  fractals  and  multifractals.  In  this  way  due  to  the  complex  statistical  dependencies  among  the 
components,  the  higher  order  redundancy  is  transformed  into  a  simple  first  order  redundancy.  In  such  a 
case,  the  less  entropy  has  each  a  component,  the  better  is  the  performance  of  a  system  [35]. 

3.1.2  V-redundancy.  Receiver  models 

The  receiver  model  describes  visual  V-redundancy  specifying  the  components  being  not 
distinguished  from  a  background  or  other  components  by  the  receiver  in  a  certain  viewing  conditions. 
Visual  independence  also  means  that  varying  one  component  we  do  not  change  visibility  of  others.  That 
is  why  one  of  the  task  of  a  receiver  model  is  to  decompose  an  image  to  visually  independent  components. 
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The  sensitivity  of  human  vision  model  (HVS)  according  to  the  pixels  within  the  component,  usually  is 
assumed  to  be  equal  or  slowly  varying. 

It  is  known  that  the  human  vision  has  some  thresholds,  and  the  fidelity  to  image  representation, 
demanded  by  the  human  eye,  differs  from  pixel  to  pixel.  The  HVS  has  two  types  of  thresholds:  global  one 
and  local  ones  [32-34].  Global  thresholds  tune  to  the  mean  brightness  of  an  image,  and  can  slowly  change 
from  image  to  image.  Local  ones  otherwise  change  quickly  from  pixel  to  pixel  within  the  same  image. 
This  is  called  masking  effect  that  can  be  temporal  and  spatial.  Spatial  masking  increases  thresholds  for  the 
pixels  near  the  large  jump  of  brightness,  for  example  on  the  contours  of  the  objects.  Temporal  masking 
retains  the  local  masking  effect  for  a  while,  even  if  it  is  not  situated  in  the  next  frame.  The  final 
thresholds  appear  as  a  result  of  enough  complex  and  yet  unknown  interaction  between  all  these  processes. 

The  degree  of  visual  independence  achieved  can  be  assessed  as  some  consumption  of  bits  per  pixel 
(BPP)  used  for  the  individual  components  in  comparison  with  total  BPP  of  the  source  image. 

Let  U  =  =  CQ  ©  Cl  ©  ...©  cjif  is  some  certain  composition  of  components  cq 

Cl  ...  CM  and  h(c()),  h(cj),...,h(c}^^  are  corresponding  thresholds  of  HVS.  If  the  range  of  components  is 
R,  then  these  components  can  be  quantized  on  R/h(cQ)=N(cQ),  R/h(ci)=  N(ci),...,R/h(cj^=  N(cj^  levels. 
Let  us  assume  that  N(cq)  <=  N(ci)  <-...<=  N(c}f)  and  corresponding  numbars  of  BPP  are  B(c())  <= 
B(ci)  <=...<=  B(c]^.  Then  the  quantization  of  the  whole  composition  will  demand  more  long 
description  than  separate  quantization  of  the  components,  because  we  have  to  use  maximum  number  of 
levels  Nfjjox  -  max(N(Cffi)),  m  =  0,...,M,  to  be  sure  in  invisibility  of  quantization  errors  for  the  whole 
image,  i.e.: 

2;iogAr(c.)<i;iogAf.„  . c„)).  0.2) 

m  m  m 

Existence  of  threshold  properties  of  human  vision,  enables  us  to  consider  this  feature  as  a  kind  of  a 
compressor.  The  space  ofimages  is  divided  onto  the  multidimensional  cells  of  complex  shape.  Within  the 
cells  images  do  not  differ  from  each  other.  At  the  same  time  any  existing  compressor  do  the  same,  that  is 
creates  analogous  set  of  cells.  So  the  fact  that  we  see  some  impairments  on  the  decompressed  image  can 
be  explained  by  the  differences  between  the  human's  and  compressor’s  cells.  So,  each  human  vision 
model  may  be  considered  as  a  compressor,  and  in  contrast,  any  compressor  is  a  sort  of  a  vision  model. 

The  subthreshold  properties  of  a  human  vision  are  well  investigated  both  in  still  and  moving 
environment.  That  is  why  the  compression  task  can  be  solved  by  reconciling  threshold  properties  of 
compressor  with  thresholds  of  vision  [32].  Besides  of  physical  limitations  of  eye  as  itself,  there  is  a  lot  of 
uncontrolled  and  subjective  factors  leading  to  large  variety  of  estimates.  Some  users  prefer  "soft",  slightly 
smoothed  images  (methods  basing  on  the  orthogonal  transformations),  other  ones  prefer  more  "rigid" 
with  contrast  contours  (methods  basing  on  the  block  quantization).  Nevertheless,  this  task  can  be 
formalized  under  the  object  -  oriented  approach.  Here  the  main  task  is  to  formalize  an  expert's 
requirements  (for  example,  in  terms  of  probability  of  false  accept  and  false  reject). 

So,  we  consider  the  ideal  compressor  as  one  with  some  threshold  properties,  reconciled  with  the 
vision  properties  of  the  receiver.  Among  all  such  compressors  some  one,  having  best  statistical  properties 
of  quantization  cells,  will  ensure  maximal  compression. 

3.1.3.  Problem  statement. 

On  Fig.6.  typical  scheme  of  a  compressor  and  decompressor  is  presented.  An  image  is  decomposed 
by  linear  transformation  to  several  components  followed  by  quantization  and  lossless  Huffman  encoding. 
On  the  receiver  side  Huffman  decoding  and  dequantization  are  performed,  and  finally  the  reverse  liner 
transform  restores  decompressed  image. 
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Fig.6.  Typical  flow  chart  of  compressor  and  decompressor. 

Let  us  assume  that  a  source  image  consists  of  three  major  components:  smoothed  component  Ug,  contour 
one  Uc  and  residual,  noise  like  Un  one: 

U  =  U,+U„=Us+(Uc+U^).  (3.3) 

In  most  cases  the  component  does  not  carry  any  useful  information,  but  takes  rather  large  amount  of 
bits  to  describe  it.  In  contrast,  Ul  and  Uc  components  usually  contains  much  more  information  about  the 
shape  of  the  objects.  That  is  why  ideally  we  would  like  to  have  resulting  image  as  sum  of  separated  Ul 
and  Uc-  It  is  relatively  simple  to  separate  Ujj  =  Ug  component  from  the  sum  U|j  =  Ug  +  Un  using 
linear  decomposition  W.  It  is  because  the  Ul  and  Ug  components  are  in  different  frequency  area.  In 
contrast,  it  is  pot  trivial  to  separate  Uc  and  Un- 

To  separate  them,  more  complex,  in  general  case,  nonlinear  filtering  is  needed.  Primitive  nonlinear 
operation  -  quantization  is  usually  used  in  the  most  compression  schemes.  It  is  tuned  onto  the 
components  with  high  energy,  and  removes  others  with  low  energy.  This  is  in  contradiction  to  HVS 
properties.  We  can  clearly  see  low  energy  Uc  component  since  it  is  highly  structured,  in  contrast  to  Un 
component  which  looks  like  random  noise.  Simple  lowering  of  the  number  of  quantization  levels  leads  to 
fast  reducing  of  the  quality  of  decompressed  image.  From  an  expert’s  point  of  view,  when  a  power  of 
compression  increases,  it  would  be  more  reasonable  to  loose  some  other  objects  instead  of  those  of 
interest.  To  solve  this  task  it  would  be  useful  to  separate  useful  objects  from  others  and  redistribute  bits 
budget  so,  to  reproduce  them  perfectly.  In  general,  such  a  decomposition  should  rely  on  a  priory 
information  about  the  objects  of  interest.  Here  again  we  can  consider  the  decompositions  to  line  of 
images  with  different  amount  of  a  priory  information,  from  simple  contour  detection  to  highly 
“intellectual”,  recognizing  the  objects.  The  scheme  of  a  compressor  with  nonlinear  decomposition  is 
shown  on  Fig.7. 


Fig.7.  Compression  with  nonlinear  decomposition. 
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The  goal  may  be  formulated  in  the  following  way.  To  investigate  potential  reserve  of  rate/distortion 
function  of  a  compressor  improving  on  the  basis  of  chosen  linear  transformation,  and  simple  nonlinear 
image  decomposition  separating  Uc  and  components. 

3.2.  Wavelet  decomposition. 

3.2.1.  Wavelet  basis 

One  of  the  most  commonly  used  approaches  for  analyzing  a  signal  U(x)  is  to  represent  it  as  a 
weighted  sum  of  simple  building  blocks,  called  basis  functions: 

U(lt)  =  Ec.'P(x),  (3.4) 

n 

where  v|/jj(x)  are  the  basis  functions,  and  Cn  are  coefficients,  or  weights.  Since  the  basis  functions  are 
fixed,  the  information  about  the  signal  contains  only  in  the  coefficients.  There  are  many  ways  to  choose 
basis  functions.  The  simplest  one  uses  a  set  of  translated  pulses  as  the  basis,  and  it  leads  to  spatial 
representation  of  the  data.  Choosing  the  set  of  harmonic  curves  as  the  basis  functions,  results  in  Fouirier 
representation,  containing  information  in  frequency  domain  [28-31]. 

For  the  signal  compression  purposes,  neither  of  the  above  representations  is  ideal.  We  would  like  to 
have  a  representation,  which  contains  information  about  both  the  space  and  frequency  behavior  of  the 
signal.  We  want  to  know  the  frequency  content  of  the  signal  only  at  a  particular  instant  in  space. 
However,  spatial  resolution  (Ax)  and  frequency  resolution  (Aco)  cannot  be  chosen  enough  small  together 
at  the  same  time,  because  their  product  is  bounded  below  by  the  Heisenberg  inequality: 

AxA®  >  Vi.  (3.5) 

This  inequality  means  that  we  must  trade  off  time  resolution  for  frequency  resolution.  Thus,  it  is 
possible  to  have  very  good  spatial  resolution  if  you  are  willing  to  settle  low  resolution  in  frequency,  or, 
vice  versa,  it  is  possible  to  have  very  good  resolution  in  frequency  if  you  are  willing  to  settle  low  spatial 
resolution. 

By  their  nature,  low  frequency  events  are  spread  out  (or  non-local)  in  space,  and  high  frequency 
events  are  concentrated  (or  localized)  in  space.  Thus,  to  left  within  the  confines  of  the  Heisenberg 
inequality  and  to  have  sufficient  spatial-frequency  information  about  a  data,  we  have  to  construct  the 
basis  functions  like  cascaded  octave  filters,  which  repeatedly  split  the  data’s  bandwidth  in  a  half 

Let  us  constrain  all  of  the  basis  functions  in  {vi/n)  to  be  scaled  and  translated  versions  of  the  same 
prototype  function  v|/,  known  as  the  mother  wavelet.  The  scaling  is  accomplished  by  multiplying  x  by 
some  scale  factor.  If  we  choose  the  scale  factor  to  be  a  power  of  2,  yielding  m/(2^x),  where  v  is  some 
integer,  we  get  the  desirable  cascaded  octave  bandpass  filter  structure.  Because  the  function  \|/  has  finite 
support,  it  will  need  to  be  translated  along  the  space  axis  in  order  to  cover  an  entire  data.  This  translation 
is  accomplished  by  considering  all  the  integral  shifts  of  vp: 
v|/(2Vx-k),  keZ. 

(3.6) 

This  really  means,  that  we  are  translating  vp  in  steps  of  size  2*^,  because  vp  (2^x  -  k)  =  ip(2''’(x  -  2"^  k). 
Putting  this  all  together,  gives  us  a  wavelet  decomposition  of  the  signal 

uw=sl;c^'p^(x),  (3.7) 

V  k 

where  -  k)  (the  multiplication  by  2^/^  is  needed  to  be  the  basis  orthonormalized). 

The  coefficients  Cyjt  must  be  computed  by  wavelet  transform,  which  is  just  the  inner  product  of  the  signal 
U(x)  by  corresponding  basis  function  vpy;t(x).  Wavelet  may  be  implemented  as  octave  bandpass  filters, 
which  in  turn  may  be  represented  as  cascaded  implementation  of  “lowpass“  and  “highpass”  {g]^} 
filters.  Below  we  briefly  describe  the  implementation  details  of  the  wavelet-based  image  compression. 

3.2.2.  Implementation  of  the  wavelet  transformation. 

At  the  beginning  the  compressed  signal  is  decomposed  to  the  pair  of  components  with  low  and  high 
frequencies  using  the  convolution-subsampling  operators.  It  is  performed  in  the  spatial  domain  with  the 
helps  of  “lowpass”  {Ak)  and  “highpass”  {g^}  filters.  Let  H  and  G  be  such  convolution-subsampling 
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operators  realizing  these  filters,  and  H*  and  G*  be  their  adjacent  (i.e.  upsam ling-anticonvolution) 
operators.  We  may  choose  filters  of  finite  length  (Z)  and  satisfy  the  following  conditions  of  orthogonality 
and  perfect  reconstruction: 

HG*=GH’=0,  H*H+G*G  =  I,  (3.8) 

where  I  is  the  identity  operator. 

This  decomposition  process  is  iterated  on  the  low  frequency  components.  Let  U={uk}N-l(j=o  be  a 
vector  to  be  decomposed.  Then,  the  convolution-subsampling  operations  transform  the  vector  U  into  two 
subsequences  each  of  N/2  elements  length:  H\3  and  GU.  Next,  the  same  operations  are  applied  to  the 
vector  IW  to  obtain  and  GUM  of  length  N/4.  If  the  process  is  iterated  J  times,  we  have  the  discrete 

wavelet  coefficients  (GTJ,  GH\J,  Gifiv,  ...  ,GflJu,  ZT^+lU)  of  length  N.  As  a  result,  the  wavelet 
transform  analyzes  the  data  by  partitioning  its  frequency  content  dyadically  finer  and  finer  toward  the  low 
frequency  region  (i.e.,  coarser  and  coarser  in  the  original  space  domains). 

The  reconstruction  (or  synthesis)  process  is  also  very  simple:  starting  from  the  lowest  frequency 
components  (or  coarsest  scale  coefficients)  ZfJ+lU,  and  the  next  frequency  components  GffJu,  the 
adjacent  operations  are  applied  and  added  to  obtain  +  G*GH^\i.  This  process  is 

iterated  to  reconstruct  the  original  vector  U.  The  computational  complexity  of  the  decomposition  and 
reconstruction  process  is  in  both  cases  0(N)  as  easy  to  see. 

We  use  the  following  anti-aliasing  relations  between  analysis  {/rk},  (gk)  and  synthesis  {h\}, 
{g*k}  filters: 

A*k  =  (-l)"^k-l>  ^*k  =  (-l)"'^Ak-l  (3-9) 

The  filters  coefficient  is  presented  in  Table  4. 


Table  4. 


Tap 

Approximate  value 

0.85269867900940 

hi  =  h.i 

0.37740285561265 

II 

-0.11062440441842 

h3  =  h.3 

-0.0238494650119380 

II 

0.037828455506995 

5-1 

0.78848561640566 

5-2=50 

-0.41809227322221 

5-3=51 

-0.040689417609558 

5-4=52 

0.0645388826282938 

3.2.3.  Source  image 

We  assume,  that  source  images  are  captured  with  the  precision  of  8  bits/pixel.  Before  the  encoder 
will  compute  the  discrete  wavelet  transform  of  the  image,  the  samples  U(x,y)  must  be  transformed 
according  to  the  following  equation: 

U*(x,y)  =  (U(x,y)-M)/R,  (3.10) 

where  M  -  mean  value  of  the  image  pixels,  but  R  =  max  (Umax  -  M,  M  -  Umin)^^2S>  Umin  Umax 
are,  respectively,  the  minimum  and  maximum  pixel  values  in  the  image. 


3.3.  Quantization. 

3.3.1.  Subband  variance  computation 

A  subband  variance  estimate  is  calculated  basing  on  a  subregion  of  each  subband.  Let  ak(x,y)  denote 
the  floating  point  array  of  width  Xk  and  height  Yk,  comprising  the  subband.  The  width  and  height  of 
the  subregion  to  be  used  for  the  estimate,  are: 
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Xii*  =  round(3Xk/4) 

Yk*  =  round(7Yk/16). 

The  function  round(...)  denotes  rounding  of  a  floating  point  value  to  the  nearest  integer.  The  variance  will 
be  computed  by  the  unbiased  estimator: 

c7^  =  y.  A--  ^  (3.11) 

^  n=^Djk  «‘=yD.k 

where  pk  denotes  the  mean  value  of  ak.  The  horizontal  and  vertical  offsets  for  the  subregion  (xp  k  and 
yp  k),  relative  to  the  upper  left  comer,  will  be: 

X0,k  ~  |Xk/8|,  XI  k  =  XQ^k  ^ > 

(3.12) 

y0,k  =  |9Yk/32|,  yi^k  =  y0,k  +  Yk*  -  1- 


3,3.2.  Bin  width  calculation. 

The  formula  for  the  bin  widths  Qk  is: 

f  Mq,  if  k  =  0. 


Qk- 


10 


otherwise 


U*loge(of)’ 

The  zero  bin  Zk  is  computed  as  Zk  =  1.2-Qk. 


(3.13) 


3.3.3.  Quantization  procedure. 

Each  subband  is  coded  separately  according  to  a  scalar  quantizer  characteristic  having  uniform  width 
bins  with  the  exception  of  the  zero  bin,  which  is  20%  wider.  The  quantization  transforms  a  floating-point 
wavelet  coefficient  C  to  an  integer  quantizer  index  q,  that  indicates  the  quantizer  bin  in  which  C  lies.  The 
dequantization  transforms  the  index  ^  to  a  real  number  Cq,  representing  all  data  values  that  lie  within  that 
bin.  Encoder  transmits  the  values  of  Qk  (the  bin  width)  and  Zk  (the  zero  bin  width)  for  each  subband 
along  with  the  Huffman  coded  quantizer  indices.  Quantization  of  the  k^h  two-dimensional  subband, 
C^(m,n),  is  given  by  the  following: 

if  (Q:  (m,n)  >  Zk/2),  9^(m,n)  =l  (ak(m,n)  -  Zk/2)/QkJ  +  1 ; 

if  (-Zk/2  <Ck(m,n)  <  Zk/2),  =0;  (3.14) 

if  (Ck(m,n)  >  Zk/2,  qi^m,n)  4  (ok(m,n)  +  Zk/2)/Qk  1-  1  • 

The  notation  f.  1  and  L.  J  denotes  the  functions  those  round  values  to  the  next  largest  and  next  lowest 
integer,  respectively.  The  quantified  wavelet  coefficients  produced  by  the  dequantizer,  are  given  by: 

if  (qic  (m,n)  >  0),  Q*(m,n)  =  (  qk  (m,n)  -  B)Qk  +  Zk/2; 

if  (?^(m,n)=  0),  Q*(m,n)  =  0;  (3.15) 

if  (Ck(m,n)  >  Zk/2,  Q*  (m,n)  =  (  qk  (m,n)  +  B)Qk  -  Zk/2, 

where  B  is  some  parameter  between  0  and  1  that  determines  the  reconstructed  values.  Note,  that  if  B  = 
then  the  reconstmcted  value,  corresponding  to  each  quantization  bin,  would  be  the  bin’s  midpoint.  We 
used  B  =  0.44.  The  quantizer  indices  qk(rn,n)  are  transmitted  losslessly  by  a  combination  of  zero  mn- 
length  and  Huffman  encoding. 


3.3.4.  Entropy  encoding 

A  Huffman  encoder  is  used  to  assign  variable  length  codes  to  the  quantized  coefficients  within  a 
block.  Special  codes  are  provided  for  zero  runs.  Coefficients  and  zero  run  lengths  outside  the  range  of  the 
table  are  embedded  in  the  code  stream  with  an  escape  sequence.  Table  2.  lists  the  complete  symbol  set 

Table  5.  Huffman  table  of  input  symbols. 

Position  Value 

zero  run  length  1 
zero  mn  length  2 
zero  run  length  3  . 


1 

2 

3 
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100  zero  run  length  100 

1 0 1  escape  for  position  8  bit  coefficient 

102  escape  for  negative  8  bit  coefficient 

103  escape  for  position  16  bit  coefficient 

104  escape  for  negative  16  bit  coefficient 

105  escape  for  zero  run  8  bits 

106  escape  for  zero  run  16  bit 

107  coefficient  value -73 

108  coefficient  value -72 

109  coefficient  value  -7 1 


180  -  use  position  1  only  - 


253  coefficient  value  73 

254  coefficient  value  74 

As  we  are  interested  in  gain  in  r/d  characteristic,  we  do  not  implement  real  Huffman  encoding  and  use 
only  the  value  of  pseudo  entropy  H,  calculated  as: 

H pseudo  =  -l]^('w)Iog2  Pirn),  (3.16) 

m 

where  Pi-..)  is  a  histogram  of  the  symbols,  obtained  in  accordingly  to  some  accepted  standard,  before  the 
Huffman  encoding. 

3.4.  Nonlinear  decomposition 

Acceptable  nonlinear  decomposition,  separating  an  image  to  components  Uc  and  Un,  musts  rely  on 
the  vision  properties  of  the  receiver.  Our  ability  to  detect,  to  discriminate,  and  to  process  week  signals 
with  a  noise  is  based  on  a  priory  knowledge  of  signal’s  shape.  During  the  long  evolution,  our  vision 
system  produced  a  number  of  special  receptive  field,  capable  quickly  respond  to  the  known  signals.  It  is 
well  known,  that  the  longer  and  bigger  is  an  object,  the  higher  is  the  response  of  human  vision  to  them. 
Small  spot-like  objects  are  hardly  visible  even  on  the  complex  background.  It  is  because  the  sensitivity  of 
human  vision  to  such  objects  is  high.  It  can  be  partially  explained  by  the  mechanism  of  signal  summation. 
If  we  know  the  shape  of  a  signal,  we  can  use  some  special  filter,  matched  to  this  signal,  and  detect  the 
places  where  this  filter  has  large  response.  It  is  known,  that  human  vision  has  multiple  specialized 
receptive  fields  replying  only  on  the  structured  stimulus,  as  lines  differently  oriented  in  space  [33]. 
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Fig.7. 

The  response  of  these  receptive  fields  can  be  modeled  by  the  convolution  of  the  matrix  of  weighting 
coefficients  with  an  image,  i.e.: 

R  =  U®F,  (3.17) 

where  F  is  local  matrix  modeling  the  receptive  field,  symbol  0  means  convolution,  and  R  is  a  response  of 
the  receptive  field  on  a  signal  U. 

We  will  not  discuss  this  question  further,  but  will  describe  the  simple  model  of  the  receptive  field. 
Let  there  are  N  simple  receptive  fields,  tuned  on  the  special  structured  signal  (k  =  0,...,K-1).  Then  the 
behavior  of  the  complex  field  can  be  described  as  the  sum  of  the  replies  of  simple  fields  after  passing  the 
threshold  operation.  On  Fig.7.  we  can  see  such  a  structure. 

We  used  the  following  simplest  receptive  fields: 

Horizontal  direction  Vertical  direction 

Fo  Fi  F2  F3 

0  0  0  -1  -1  -1  -1  1  0  0  1  -1 
111  111  -110  01-1 

-1-1-1  0  0  0  -1  1  0  0  1  -1 

Diagonal  direction 

F4  F5  F6  F7 

10  0  f  -1  -1  0  0  1  -1-11 

-110  0  1-1  0  1-1  -110 

-1-11  0  0  1  1-1-1  10  0 


The  reply  of  this  fields  is  calculated  by 

1  1 

'Rp{k,l)  =  ^^lJ{k+m,l+n)-F{m+l,n+l),  (3.18) 

nt^-]n=-\ 

p  =  0,...,7. 

\  0,  if  R(k,l)<T 

if 

Resulting  reply  R  is  the  average  of  partial  replies 

R(t,')=7ZRr(*-0.  (3.20) 

®  p 

So,  as  a  model  of  Uc  contour  component  we  use  (3.20).  For  smoothed  Ug  component  we  can  use  local 
mean,  that  is  Ug  =  M: 


(3.18) 


(3.19) 


(3.20) 


1  '  ' 

M(M)=qSSu(A:+/w,/+«). 

It  is  evident  that  if  threshold  T  is  zero,  then  (3.20)  is  simple  linear  laplacian  R^  ,  that  is 

Ri  (^.0  =  I  Er,(*.0  =  U(*,0  - 

°  P 

On  the  borders  for  indices  k  and  /  we  use  the  following  rule: 
k=  min  (max  (0,  k+Tri)JC), 

/  =  min  (max  (0,  /+«  ),L), 

where  K  and  L  are  the  numbers  of  lines  and  columns  in  an  image. 


(3.21) 


(3.22) 
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As  was  mentioned  above,  the  nonlinear  decomposition  can  be  applied  as  before  as  after  the  linear 
decomposition.  For  the  linear  case  we  will  use  (3.20),  but  for  the  nonlinear  one,  the  following  combined 
nonlinear  filter  will  be  used: 

Rs+cm  =  Mik.l)  +  Rik.l),  (3.23) 

that  is,  if  the  reply  R(...)  equals  zero,  the  signal  value  is  substituted  by  the  local  mean  value. 


3.5.  Experimental  results. 

The  goal  of  the  experiment  was  to  reveal  further  reserves  of  rate/distortion  ratio  of  wavelet-based 
strategy  consisting  in  the  quantizing  and  the  entropy  encoding  of  wavelet  coefficients. 


To  assess  resulting  images  we  will  use  subjective  estimation  in  a  fixed  viewing  conditions  and  well- 
known  formula  for  calculating  of  PSNR: 

PSNR  =  201og„-^,  (3.24) 


where 


MSE  = 


1 


M-N 


Vmse  ’ 

2;I!(U(/«,/t)-U*(m,«))^ 


(3.25) 


and  Af  and  A^are  the  number  of  lines  and  columns  in  the  image. 


3.5.1.  Components 

The  low-pass  component  Us,  obtained  on  the  original  image,  corresponds  to  linear  decomposition.  It 
may  be  shown,  that  component  Uh  contains  both  a  mixture  of  contours  Uc  and  noise  Ujq  components. 
Applying  line^  decomposition  to  the  component  Ujj  one  can  not  separate  needed  components.  Really 

FUh  =  J?^(Uc  +  Un)  =  fUc  +  EUn  = 

(3.26) 

where  FU^^O,  but  FUc  smoothed  contour  component.  Adding  of  this  component  to  the  smoothed  Us 
results  an  image  with  reduced  resolution.  In  turn,  noise-like  component  Uh  -  FUh  contains  considerable 
part  of  contour  component.  It  means,  that  to  separate  those  components  perfectly  we  need  to  use 
nonlinear  decomposition. 


If  we  use  nonlinear  decomposition,  we  managed  to  obtain  good  contour  component  U^.  As  a  result 
we  have  perfect  restored  image  Us  +  U^.  It  is  seen  also,  that  component  Uj^  looks  really  like  noise  with 
negligible  part  of  contour  component. 

Qualitative  estimate  demonstrates,  that  nonlinear  decomposition  gives  36.823  dB,  but  linear  one  - 
only  27.3 1 5  dB. 

3.5.2.  Rate/distortion  properties. 

The  rate/distortion  results  are  presented  in  Table  6.  For  comparison,  several  images  were  compressed 
by  standard  wavelet  technology  and  proposed  one  with  different  quantization  steps. 


Table  6.  Standard  compressor  Proposed  compressor 


Bits/pixel 

PSNR 

Bits/pixel 

PSNR 

Q=10 

1.638 

34.22 

1.50 

33.98 

Q  =  20 

1.11 

30.9 

1.05 

30.2 

Q  =  30 

0.85 

29.26 

0.8 

28.9 

Q  =  40 

0.725 

28.16 

0.67 

28.3 

It  was  demonstrated,  that  nonlinear  decomposition  gives  us  perfect  separating  of  contour  and  noise 
component.  In  contrast,  linear  decomposition  with  adaptive  quantization  of  the  subbands  gives  less  value 
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of  a  gain.  Wavelet  compression  with  nonlinear  decomposition  can  be  useful  in  air  and  space  images, 
where  preserving  of  small  object  corrupted  by  noise  may  be  important. 
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4.  CONCLUSION 

An  image  was  considered  as  a  blend  of  several  statistically  and  semantically  independent 
components,  each  of  them  containing  details  of  different  information  classes  and  statistical  properties.  In 
the  above  discussion  we  pursued  the  goal  to  decompose  an  image  to  two  components:  smoothed  and 
texture  ones.  The  tasks  of  image  enhancement,  restoration,  and  compression  were  considered  from  the 
position  of  preliminary  decomposition  and  following  manipulating  with  obtained  components. 

Image  decomposition  helps  to  decide  many  different  tasks,  like  noise  removing,  enhancement  of 
gray-scale,  color,  and  multispectral  images,  extraction  of  texture,  noise-free  detection  of  contours, 
localization  of  image  details,  segmentation,  and  some  others.  We  proposed  rank  algorithms  for  image 
decomposition  and  image  enhancement,  using  statistics  for  local  window.  The  efficiency  of  algorithms 
was  verified  by  processing  of  real  and  artificial  images. 

Real  image  is  enough  complex  signal,  and  its  simplification  by  decomposing  onto  several 
information  components  looks  to  be  perspective  approach  to  the  decision  of  different  image  processing 
and  analysis  tasks.  Separating  from  an  image  only  the  component  of  interest,  we  may  get  some  new 
starting  point  for  the  decision  of  many  traditional  tasks,  the  set  of  which  is  significantly  wider  than  it  was 
discussed  above.  For  example,  extracting  texture  component  and  avoiding  average-brightness 
information,  one  can  decide  texture  image  segmentation  more  efficiently.  Varying  area  parameters  for 
relating  objects  to  one  or  another  component,  one  can  detect  objects  from  an  image  regarding  their  sizes, 
etc. 
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